Renewable Energy 29 (2004) 2111–2131
www.elsevier.com/locate/renene
Stochastic generation of hourly mean wind
speed data
Hafzullah Aksoy, Z. Fuat Toprak, Ali Aytek,
N. Erdem Ünal
Department of Civil Engineering, Civil Engineering Faculty, Istanbul Technical University,
Hydraulics Division, Maslak, 34469 Istanbul, Turkey
Received 19 September 2003; accepted 23 March 2004
Abstract
Use of wind speed data is of great importance in civil engineering, especially in structural
and coastal engineering applications. Synthetic data generation techniques are used in practice where long wind speed records are required. In this study, a new wind speed data
generation scheme based upon wavelet transformation is introduced and compared to the
existing wind speed generation methods, namely normal and Weibull distributed independent
random numbers, the first- and second-order autoregressive models, and the first-order Markov chain. The results suggest the wavelet-based approach as an alternative to the existing
wind speed data generation methods.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Normal distribution; Weibull distribution; Autoregressive models; Markov chain; Wavelet;
Hourly mean wind speed
1. Introduction and existing literature
Climatology is defined as a set of probabilistic statements on long-term weather
conditions [1], and wind climatology as that branch of climatology that specialises
in the study of winds, from which information on extreme winds is provided to
structural designers. Such information is also needed for wind energy producers
and engineers who design coastal civil structures, for example breakwaters. From a
structural engineering point of view, forecasting the maximum wind speed that is
Corresponding author. Tel.: +90-212-2856577; fax: +90-212-2856587.
E-mail address: haksoy@itu.edu.tr (H. Aksoy).
0960-1481/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.renene.2004.03.011
H. Aksoy et al. / Renewable Energy 29 (2004) 2111–2131
expected to affect a structure during its lifetime is important to the designer. On the
other hand, in coastal engineering practices, not only the magnitude but also the
directionality of wind becomes important. The duration of wind, in addition to its
magnitude and direction, is also required in wind energy production systems, and
the amount of energy that can be produced depends upon it.
The information required by either structural and coastal engineers or wind
energy producers is related to wind speed data, and is a matter of quality and quantity. The quality of the wind speed data refers to whether the data set is reliable and
micrometeorologically homogeneous. A data set is reliable if (i) the measurement
instrument performs adequately, (ii) the instrument is not influenced by obstructions
and (iii) the atmospheric stratification is neutral. A set of wind speed data is considered micrometeorologically homogeneous if the data set is obtained under identical micrometeorological conditions [1]. The size of the data set (quantity) is related
to the time period during which the wind speed data are recorded. The time period
over which wind speed data are recorded is usually shorter than the lifetime of civil
engineering structures. Therefore, the worst case of wind load that the structural
designer expects that the structure will face during its lifetime is determined by modelling the wind speed data record in hand. For this, climatological and physical
modelling techniques are available. Additionally, probabilistic and stochastic models
have been developed, for which the existing literature is reviewed in brief below. The
main aim in those techniques is to determine minimum design loads due to wind [2].
Short records of daily, weekly, and monthly highest wind speeds taken at 36
weather stations in the US were empirically analyzed [3] in order to determine
design wind speeds. Short records of hourly mean wind speed data from normal
regions in the US were used by Cheng and Chiu [4] for determination of the transition probabilities of the Markov chain upon which the methodology in that study
was based. This methodology was extended later to tropical cyclone-prone regions
[5]. Also, a knowledge-based expert system, principally similar to the mentioned
methodologies, was made available [6,7]. Alternative approaches used in the generation of simulated wind speed time series were compared by Kaminsky et al. [8].
Sfetsos [9] examined adaptive neuro-fuzzy inference systems and neural logic networks and compared them to the traditional autoregressive moving average
(ARMA) models. Dukes and Palutikof [10] employed the Markov chain in order to
estimate hourly mean wind speed with very long return periods. Another Markov
chain based study was conducted by Sahin and Sen [11]. Castino et al. [12] coupled
autoregressive processes to the Markov chain and simulated both wind speed and
direction. A recent study [13] presents a wavelet-based method to generate artificial
wind data. The Weibull distribution has commonly been fitted to hourly mean
wind speed data [14,15]. The peaks-over-threshold approach has also been commonly used in the estimation of extreme quantiles of wind speed data [16–19].
2. Methods
In this study, a number of probabilistic and stochastic methods are used in order
to compare their ability to reproduce long series of hourly mean wind speed data
with the same statistical behaviour as that of the observations in hand. The normal
and Weibull probability distribution functions are chosen in order to generate
independent and identically distributed random numbers. Autoregressive processes
are useful tools in generating data sets in cases where persistency exists. Persistency
means that large values tend to be followed by large values, and small values by
small values, so that runs of values of similar magnitude tend to persist throughout
the sequence. First- and second-order autoregressive processes are chosen in this
study. Another concept commonly employed in wind speed data generation studies
is the first-order Markov chain. The results of these methods are compared to
those obtained from a newly developed wavelet-based approach.
The methods are described below. Only the wavelet-based approach is described in
detail, whereas the remaining five methods are outlined briefly, as they are
well documented in the literature.
2.1. Normal distribution
Hourly mean wind speed time series are generated by using a sequence of independent random numbers from the normal distribution. The normal probability
distribution function is given by
$$f(w) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(w-\mu)^2}{2\sigma^2}\right] \qquad (1)$$
where w is the variable (hourly mean wind speed, in this study), μ the mean value of
wind speed, and σ the standard deviation of wind speed. A number of computational
methods are available for the generation of random numbers with normal probability distribution of mean μ and standard deviation σ.
2.2. Weibull distribution
The Weibull distribution is another probability distribution function commonly
used for the frequency analysis of wind speed data [14,15]. It is given by
$$f(w) = \frac{a}{b^{a}}\, w^{a-1} \exp\left(-\frac{1}{b^{a}}\, w^{a}\right), \qquad w \ge 0;\ a, b > 0 \qquad (2)$$
where a and b are shape and scale parameters, respectively, that can be determined
by using either a graphical method or the method of moments. They can also be
determined using the method of probability weighted moments (PWMs) for which
explicit equations are available. It is the method used in this study for the determination of parameters.
Equations to be used for this purpose are given by
$$a = \frac{\ln(2)}{L_{2,(\ln w)}} \qquad (3a)$$
$$b = \exp\left(L_{1,(\ln w)} + \frac{0.5772}{a}\right) \qquad (3b)$$
In Eqs. (3a,b), $L_{1,(\ln w)}$ and $L_{2,(\ln w)}$ are the L1 and L2 moments of the logarithm of
the hourly mean wind speed time series. The L1 and L2 moments of a series are
given by
$$L_1 = b_0 \qquad (4a)$$
$$L_2 = 2b_1 - b_0 \qquad (4b)$$
in which $b_0$ and $b_1$ are given by
$$b_0 = \bar{x} \qquad (5a)$$
$$b_1 = \sum_{j=1}^{N-1} \frac{(N-j)}{N(N-1)}\, x_j \qquad (5b)$$
$x_j$ in Eq. (5b) comes from the time series sorted in descending order as
$x_N \le \cdots \le x_i \le \cdots \le x_1$. Detailed information on L moments and the method of
PWM is given in [20].
Once the parameters are determined, the generation of Weibull distributed random numbers is a matter of a simple computer code, as the cumulative distribution
function of the Weibull distribution can be obtained in closed form.
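As an illustration only, the parameter estimation of Eqs. (3)-(5) and the closed-form (inverse-transform) generation mentioned above might be sketched in Python as follows. The function names `fit_weibull_lmoments` and `sample_weibull` and the use of NumPy are assumptions made for this sketch and are not taken from the study; zero wind speeds are dropped before the logarithm is taken.

```python
import numpy as np

def fit_weibull_lmoments(w):
    """Estimate shape a and scale b from Eqs. (3)-(5) (hypothetical helper)."""
    w = np.asarray(w, dtype=float)
    # Logarithm of the positive speeds, sorted in descending order (x_1 largest).
    x = np.sort(np.log(w[w > 0]))[::-1]
    n = x.size
    j = np.arange(1, n + 1)
    b0 = x.mean()                                  # Eq. (5a)
    b1 = np.sum((n - j) * x) / (n * (n - 1))       # Eq. (5b); the j = N term is zero
    l1 = b0                                        # Eq. (4a)
    l2 = 2.0 * b1 - b0                             # Eq. (4b)
    a = np.log(2.0) / l2                           # Eq. (3a)
    b = np.exp(l1 + 0.5772 / a)                    # Eq. (3b)
    return a, b

def sample_weibull(a, b, size, rng):
    """Invert the Weibull CDF F(w) = 1 - exp(-(w/b)**a) at uniform random numbers."""
    u = rng.random(size)
    return b * (-np.log1p(-u)) ** (1.0 / a)
```

A call such as `sample_weibull(a, b, 8760, np.random.default_rng(0))` would give one year of independent hourly values under these assumptions.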
2.3. AR(1) model
The hourly mean wind speed time series is highly dependent. This property
requires a wind speed data generation model that incorporates the dependence structure of the observations. As mentioned, neither normal nor Weibull distributed random numbers take this property into account, as they are
independent; autoregressive models, by contrast, are correlated and hence capable of
simulating this property of the data series.
Autoregressive models are very commonly reported in the literature. An autoregressive
model of order m is fitted to the observed sequence of wind speed data
{$w_1$, $w_2$, …, $w_t$, …} in the form
$$w_i = \sum_{j=1}^{m} a_j w_{i-j} + e_i \qquad (6)$$
where w is the hourly mean wind speed, a the autoregressive coefficient, that is, the
model parameter, and e a normally distributed independent random variable. The
first-order autoregressive [AR(1)] model, which is also called the Markov model, is the
simplest case of Eq. (6), obtained for m = 1; it accommodates only the effect of the
previous value in the series. Eq. (6) then becomes
$$y_i = r_1 y_{i-1} + e_i \qquad (7)$$
where y is the standardised (zero mean and unit variance) version of the variable
and $r_1$ the lag-one serial correlation coefficient of the sequence.
The random component (e) in AR(1) is normally distributed with zero mean
and a variance of $1 - r_1^2$. The simulation procedure for the process is very simple. It requires only normally distributed random numbers to be generated.
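The recursion of Eq. (7) can be sketched as below. This is a minimal illustration under stated assumptions rather than the code used in the study: the names `simulate_ar1` and `lag_corr` are hypothetical, NumPy is assumed, and the first value is drawn from the standard normal distribution. Converting the standardised series y back to wind speed is left out.

```python
import numpy as np

def simulate_ar1(r1, n, rng):
    """Generate n standardised values y_i = r1 * y_(i-1) + e_i (Eq. 7)."""
    # Random component: zero mean, variance 1 - r1**2.
    e = rng.normal(0.0, np.sqrt(1.0 - r1**2), size=n)
    y = np.empty(n)
    y[0] = rng.normal()  # assumption: start from the stationary N(0, 1) distribution
    for i in range(1, n):
        y[i] = r1 * y[i - 1] + e[i]
    return y

def lag_corr(y, k):
    """Sample serial correlation coefficient at lag k (k >= 1)."""
    d = y - y.mean()
    return np.sum(d[:-k] * d[k:]) / np.sum(d * d)
```

With a long series, `lag_corr(y, 1)` should come out close to the `r1` supplied, and the variance of `y` close to one.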
2.4. AR(2) model
With increase in order of the autoregressive model, the dependence structure in
the observations is better preserved. Therefore, the second-order autoregressive
[AR(2)] model is preferred to AR(1). This becomes more important in cases where
dependence in the data set is very obvious, as in the hourly mean wind speed data.
AR(2) is formulated as
$$y_i = \phi_1 y_{i-1} + \phi_2 y_{i-2} + e_i \qquad (8)$$
where the autoregressive coefficients $\phi_1$ and $\phi_2$ are given by
$$\phi_1 = \frac{r_1 (1 - r_2)}{1 - r_1^2} \qquad (9a)$$
$$\phi_2 = \frac{r_2 - r_1^2}{1 - r_1^2} \qquad (9b)$$
in which $r_1$ is the lag-one autocorrelation coefficient and $r_2$ the lag-two autocorrelation coefficient of the wind speed time series. The random component in AR(2) is
again normally distributed, with zero mean and variance equal to $1 - R^2$, where
$$R^2 = \frac{r_1^2 + r_2^2 - 2 r_1^2 r_2}{1 - r_1^2} \qquad (10)$$
which is equivalent to $R^2 = \phi_1 r_1 + \phi_2 r_2$.
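A short sketch of how the AR(2) coefficients and the recursion of Eq. (8) might be coded is given here. The function names are hypothetical and NumPy is assumed. R² is evaluated as φ1·r1 + φ2·r2, which follows from Eqs. (9a,b), and the series is started from two zero values, so the first few generated values act as a warm-up; both choices are assumptions of this sketch.

```python
import numpy as np

def ar2_parameters(r1, r2):
    """Return phi1, phi2 (Eqs. 9a,b) and R^2 = phi1*r1 + phi2*r2."""
    phi1 = r1 * (1.0 - r2) / (1.0 - r1**2)
    phi2 = (r2 - r1**2) / (1.0 - r1**2)
    r_squared = phi1 * r1 + phi2 * r2
    return phi1, phi2, r_squared

def simulate_ar2(r1, r2, n, rng):
    """Generate n standardised values from Eq. (8)."""
    phi1, phi2, r_squared = ar2_parameters(r1, r2)
    # Random component: zero mean, variance 1 - R^2.
    e = rng.normal(0.0, np.sqrt(1.0 - r_squared), size=n)
    y = np.zeros(n)  # assumption: y[0] = y[1] = 0 as starting values
    for i in range(2, n):
        y[i] = phi1 * y[i - 1] + phi2 * y[i - 2] + e[i]
    return y
```

For example, `ar2_parameters(0.820, 0.688)` uses the correlation coefficients listed for the AR models in Table 2.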
2.5. Markov chain
In this approach, the observed time series is divided into a number of states. A
wind speed state contains wind speeds between certain values. For example, State 1
might include wind speeds below 2 m/s, State 2 wind speeds between 2 and 4 m/s,
etc. until the final wind speed state includes all speeds above the highest observed
value or a predefined upper limit. The upper and lower limits of the states are
highly subjective values. For instance, the hourly mean wind speed data set in this
study was divided into 10 states. In another wind speed study [11], states were
defined depending upon the standard deviation of the data set. Each state in that
study [11] was taken as wide as one standard deviation of the observed hourly
mean wind speed time series. Dukes and Palutikof [10], on the other hand, used a
fixed width for the states, which was equal to 2 m/s.
In the Markov chain approach, the state of wind speed in the current hour can
be defined depending only upon the previous state. This is called the first-order or
one-step Markov chain. Two previous states are used in the second-order or two-step Markov chain in determining the current state of the wind speed. Although
they are not as common as the first- and second-order Markov chains, higher-order
Markov chains can also be used. However, the dramatic increase in the number of
their parameters limits their use.
The parameter set of a Markov chain consists of probabilities of transition from
one state to another that are given in transition probability matrices. The transition probability matrix of a first-order Markov chain with m states can be written
symbolically as
$$P = \begin{bmatrix} P_{11} & P_{12} & \cdots & P_{1m} \\ P_{21} & P_{22} & \cdots & P_{2m} \\ \cdots & \cdots & P_{ij} & \cdots \\ P_{m1} & P_{m2} & \cdots & P_{mm} \end{bmatrix} \qquad (11)$$
where $P_{ij}$ is the probability of transition from state i to state j. The number of
parameters is m(m − 1), as the sum of the probabilities is equal to 1 (100%) for
each row of the matrix. If $n_{ij}$ is the total number of hours of observation in state j
with the previous state i, the probabilities of transition from state i to state j can be
calculated as
$$P_{ij} = \frac{n_{ij}}{\sum_{j} n_{ij}}, \qquad i, j = 1, 2, \ldots, m \qquad (12)$$
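The counting behind Eq. (12) might be sketched as follows. The helper names `wind_to_states` and `transition_matrix` are hypothetical, NumPy is assumed, states are numbered from 0 rather than 1, and equal-width states with an open-ended highest state are taken as one possible way of defining the states.

```python
import numpy as np

def wind_to_states(w, width, m):
    """Map wind speeds to 0-based state indices; the last state has no upper limit."""
    w = np.asarray(w, dtype=float)
    return np.minimum((w // width).astype(int), m - 1)

def transition_matrix(states, m):
    """Estimate P_ij = n_ij / sum_j n_ij from a sequence of state indices."""
    s = np.asarray(states, dtype=int)
    counts = np.zeros((m, m))
    # n_ij: number of hours in state j whose previous hour was in state i.
    np.add.at(counts, (s[:-1], s[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that never occur as a previous state are left as zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
```

For instance, `transition_matrix(wind_to_states(w, 2.0, 5), 5)` would use the 2 m/s state width of the example given earlier with five states.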
The procedure for generating the simulated hourly mean wind speed time series
is explained below.
First, the cumulative transition probability matrix is calculated. In the cumulative
transition probability matrix, cumulative summation of probabilities within each row
is carried out; hence, each row in that matrix ends with 1. Then, an initial state is
adopted. No wind (State 1) can, for example, be assumed as the initial state. Using a
uniform random number, the next state of wind speed can be determined. If State 1 is
obtained as the new state of wind speed, then it is first checked if the wind speed is
zero. If the wind speed is not zero, then a uniform random number is generated from
the interval of State 1. If the highest state is found to be the new state of wind speed,
then a shifted one-parameter gamma distributed random number is used in order to
find the magnitude of the wind speed. The reason for choosing the gamma distribution
will be discussed in the section where results obtained from application of the methods
are presented. For intermediate states, a uniform random number from the interval of
the corresponding state is generated and set as the wind speed at the current hour.
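The generation procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `simulate_markov`, the probability of zero wind `p_zero` and the gamma shape `top_shape` are hypothetical placeholders (their values are not given in the text), states are numbered from 0, and NumPy is assumed.

```python
import numpy as np

def simulate_markov(P, n, rng, width, p_zero=0.0, top_shape=1.0, start=0):
    """Generate n hourly states and wind speeds from a first-order Markov chain."""
    P = np.asarray(P, dtype=float)
    m = P.shape[0]
    C = np.cumsum(P, axis=1)   # cumulative transition probability matrix
    C[:, -1] = 1.0             # guard against rounding so each row ends with 1
    states = np.empty(n, dtype=int)
    speeds = np.empty(n)
    s = start
    for t in range(n):
        # Next state: first column whose cumulative probability exceeds u.
        s = int(np.searchsorted(C[s], rng.random(), side="right"))
        states[t] = s
        if s == m - 1:
            # Highest state: one-parameter gamma variate shifted to the lower limit.
            speeds[t] = s * width + rng.gamma(top_shape)
        elif s == 0 and rng.random() < p_zero:
            speeds[t] = 0.0    # zero wind speed in the lowest state
        else:
            # Other states: uniform within the state's interval.
            speeds[t] = rng.uniform(s * width, (s + 1) * width)
    return states, speeds
```

The transition matrix `P` is expected to have rows summing to one; a matrix estimated from observations and rounded to four decimals is handled by the rounding guard.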
2.6. Wavelet-based approach
A real or complex-value continuous function with zero mean and finite variance
is called a wavelet [21]. There are many functions that can qualify as wavelets.
Some examples of wavelets are Morlet, Mexican hat, Shannon and Meyer. A simple wavelet is the Haar wavelet (Fig. 1), defined as
$$\psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \qquad (13)$$
Decomposing a signal and then reconstructing it is the basis of the wavelet
transform. In this study, the Haar wavelet was used due to its simplicity. Therefore, decomposition of a signal (multiresolution analysis) with the Haar wavelet is
considered and explained in detail below.

Fig. 1. Haar wavelet.
For a certain value of k, let us define $f_k(t)$ as the average of f(s) over an interval
of size $2^k$:
$$f_k(t) = \frac{1}{2^k} \int_{2^k l}^{2^k (l+1)} f(s)\, ds, \qquad 2^k l < t < 2^k (l+1) \qquad (14)$$
where k and l are integers, k a scale variable (k > 0 means stretching and k < 0
means contracting of the wavelet) and l a translation variable [21]. For
k = −∞, …, −1, 0, 1, …, ∞, $f_k(t)$ is as follows:
$$\begin{aligned}
f_{-\infty}(t) &= f(t) \\
&\;\;\vdots \\
f_{-1}(t) &= 2\int_{l/2}^{(l+1)/2} f(s)\, ds, \quad \frac{l}{2} < t < \frac{l+1}{2} \\
f_{0}(t) &= \int_{l}^{l+1} f(s)\, ds, \quad l < t < l+1 \\
f_{1}(t) &= \frac{1}{2}\int_{2l}^{2(l+1)} f(s)\, ds, \quad 2l < t < 2(l+1) \\
&\;\;\vdots \\
f_{\infty}(t) &= 0
\end{aligned} \qquad (15)$$
The resolution decreases as k increases. The difference between the successive
averages $f_{k-1}(t)$ and $f_k(t)$ is defined as a detail function:
$$g_k(t) = f_{k-1}(t) - f_k(t) \qquad (16)$$
It can be easily seen that
$$f(t) = \sum_{k=-\infty}^{\infty} g_k(t) \qquad (17)$$
According to Eq. (17), the original signal is obtained when all detail functions
are summed up. Change in data resolution with change in k, the resolution level,
can be seen in the upper part of Fig. 2, in which the average of the time series
taken at different resolution levels according to Eq. (15) is shown. Note that the
data sample used in Fig. 2 has 16 elements. Increase in the ordinates of fk(t) with
decrease in k shows the change (increase) in the resolution. The middle part of
Fig. 2 shows the detail functions calculated using Eq. (16) for different resolution
levels. Note from Eq. (15) that $f_4(t) = 0$ for all t. At the bottom of Fig. 2, f(t), the
sum of the four detail functions according to Eq. (17), is seen, and it represents the
original data, $f_0(t)$. Eq. (17) is the basis for the generation algorithm explained
below.
Let us consider a data sample of size $M = 2^K$, where K is a positive integer
(K = 4 for the sequence in Fig. 2), taken from a stochastic process f(t) with zero
mean: f(1), f(2), …, f(M). Define the sample $f_k(i)$ (k = 0, 1, …, K; i = 1, …, M) consisting of averages of $2^k$ successive elements of the sample. $f_0(i)$ is the original sample and $f_K(i)$ is a sample of all zeros, since the average of M elements is zero. The
detail function $g_k(t)$ has a sample consisting of M elements given by Eq. (16) for
k = 1, 2, …, K.
Thus, for each element $f_i$ of the original sample, we have K detail function
values, $g_k(i)$, corresponding to different resolutions. Choosing from M elements for
each $g_k(t)$ randomly, and then summing them up using Eq. (17), one obtains a
simulated value for f(t) as
$$f(j) = \sum_{k=1}^{K} g_k(j) \qquad (18)$$
where j is the index for generated elements.
The generation algorithm is given step by step as follows [22] and is illustrated in
Fig. 3 for K ¼ 4.
1. In order to obtain the first element of the series (j = 1), $g_k$ values (k = 1, 2, …,
K) are chosen from M values randomly and summed up to obtain $f_1$ (Fig. 3).
2. The second element (j = 2) is generated by choosing, for each k, the $g_k$ coming
just after the $g_k$ value chosen in the first step. $f_2$ is obtained by the summation
of these (Fig. 3).
3. Data generation is continued in this way for a desired number of times using,
for the generation of each element $f_j$, the detail function values right next to
those of the previous step j − 1 at each resolution level.
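The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the names `haar_details` and `generate_series` are hypothetical, NumPy is assumed, the sample mean is removed inside the function (and must be added back by the caller), and the position at each resolution level is assumed to wrap around to the start of the sample when its end is reached, which the text does not specify.

```python
import numpy as np

def haar_details(sample):
    """Return the K x M array of detail functions g_k(i) of Eq. (16), k = 1..K."""
    x = np.asarray(sample, dtype=float)
    M = x.size
    K = M.bit_length() - 1
    if M < 2 or 2**K != M:
        raise ValueError("sample size must be a power of two")
    x = x - x.mean()                     # the method assumes a zero-mean sample
    previous = x.copy()                  # f_0(i): the original (centred) sample
    details = np.empty((K, M))
    for k in range(1, K + 1):
        block = 2**k
        # f_k(i): average of each block of 2**k successive elements.
        current = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        details[k - 1] = previous - current   # g_k = f_(k-1) - f_k
        previous = current
    return details

def generate_series(details, n, rng):
    """Sum one detail value per level, starting at random and then stepping by one."""
    K, M = details.shape
    starts = rng.integers(0, M, size=K)                   # step 1: random positions
    positions = (starts[:, None] + np.arange(n)) % M      # steps 2-3: next positions
    return details[np.arange(K)[:, None], positions].sum(axis=0)   # Eq. (18)
```

Summing the rows of `haar_details(x)` returns the centred sample itself, which is the reconstruction property of Eq. (17) for a finite sample.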
Fig. 2. Decomposition and reconstruction of a data sequence.
Fig. 3. Construction of a simulated data sequence.
This generation algorithm is a newly developed approach for data simulation
purposes. It was first used in non-skewed annual and monthly streamflow data
simulation studies [22,23]. The approach was later used for the simulation of the
storage capacity of river reservoirs [24]. Suspended sediment discharge
series [25] and annual and monthly rainfall data series [26] were also modelled
successfully by this approach. The algorithm reproduced the mean, standard deviation
and correlation structure of the observed streamflow data sets. When one is interested in the generation of skewed data, it is first required to transform the data to
a non-skewed structure, generate the series and then transform it back to its
skewed structure.
3. Application
The methods were applied to an hourly mean wind speed data set that will be
introduced in the following subsections. Results obtained from the application of
the methods are presented and discussed below. The performance of the methods
was measured according to their ability to capture the statistical behaviour of the
observed data set. A comparison of the methods is finally presented.
3.1. Data
Table 1 shows the main statistical characteristics of the data set of hourly mean
wind speed taken from the State Meteorological Works’ meteorology station in
Diyarbakir, a southeastern Anatolian city. The data set is of four years’ length,
from 1994 to 1997 (35064 hours in total). The region is normal, as is seen in
Table 1. The data set is highly correlated, as expected, and skewed. For the wavelet-based approach, 32768 hours of data, extending from the first hour of April 6,
1994, to the eighth hour of December 31, 1997, were used. The choice of this
particular period was arbitrary. Characteristics corresponding to that part of the observed series are
also given in Table 1.
3.2. Parameters
The hourly mean wind speed data set used in the study is of skewed structure.
This prevents fitting of the normal distribution to the data. Therefore, a power transformation [$y = x^h$, where x is the raw (untransformed) variable, y the transformed
variable, and h the transformation coefficient] was adopted in order to obtain non-skewed data, to which the normal distribution can be fitted. The transformation
coefficient was obtained as h = 0.38585 for the data set in the study. As the normal
distribution is fitted to the transformed hourly mean wind speed time series (but
not to the raw data series), the parameters of the normal distribution are the mean
and standard deviation of the transformed hourly mean wind speed time series.
Those parameters are presented in Table 2. The normal probability distribution
function based upon the determined parameters was fitted to the transformed wind
speed data series (Fig. 4). It is seen that the distribution fits the observations very well, as it does the generated data, which are explained in the
following sections.
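The text does not state how the transformation coefficient h was obtained. One possible approach, shown here purely as an assumption, is to search for the h that makes the sample skewness of $x^h$ zero. The sketch also shows how normally distributed numbers generated in the transformed domain could be transformed back to wind speeds. The function names are hypothetical, NumPy is assumed, and clipping negative normal variates to zero before the back-transformation is a choice made only for this sketch.

```python
import numpy as np

def skewness(x):
    """Sample coefficient of skewness."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return np.mean(d**3) / np.mean(d**2) ** 1.5

def find_power(x, lo=0.01, hi=1.0, tol=1e-6):
    """Bisection on h so that the skewness of x**h is (close to) zero."""
    x = np.asarray(x, dtype=float)
    if skewness(x**lo) > 0 or skewness(x**hi) < 0:
        raise ValueError("no sign change of skewness on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if skewness(x**mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def generate_normal_wind(mu, sigma, h, n, rng):
    """Draw normal variates in the transformed domain and invert y = x**h."""
    y = rng.normal(mu, sigma, size=n)
    # Assumption: negative variates are set to zero before back-transformation.
    return np.clip(y, 0.0, None) ** (1.0 / h)
```

With the values reported in this section, a call would look like `generate_normal_wind(1.347, 0.392, 0.38585, n, rng)`.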
The Weibull distribution has two parameters (a, the shape parameter, and b, the
scale parameter). The parameters were determined using the method of L-moments
on which detailed information was given previously. The reason for choosing this
method is that explicit equations are available for determination of the parameters
of the distribution. The method also has the advantage of being less sensitive to
outliers, so that outliers have little effect on the estimated parameters. The only problem with this method is the
presence of zero wind speeds, which makes the method inapplicable due to the logarithm included. In order to overcome this problem, zero wind speeds were removed
from the observed time series, as their number of occurrences was very small, less
than 0.5%. The parameters of the Weibull distribution determined by the method
of L-moments are listed in Table 2. Fig. 5 shows the agreement between the
observed data and the fitted Weibull probability distribution function. It can be
considered a very good fit, although the Weibull probability distribution function
gives the mode an occurrence probability slightly lower than that in the observation.

Table 1
Statistical characteristics of observed hourly mean wind speed time series

Statistic                     1 January 1994–31 December 1997    6 April 1994–31 December 1997
Number of data                35064                              32768
Mean (m/s)                    2.538                              2.555
Standard deviation (m/s)      1.786                              1.794
Coefficient of variation      0.703                              0.702
Coefficient of skewness       1.285                              1.283
Maximum wind speed (m/s)      14.4                               14.4
Correlation coefficient r1    0.860                              0.861
Correlation coefficient r2    0.732                              0.733
Correlation coefficient r3    0.633                              0.635
Correlation coefficient r4    0.549                              0.551
Correlation coefficient r5    0.476                              0.438

Table 2
Parameter sets of methods

Method          Parameter set
Normal          μ = 1.347 m/s    σ = 0.392 m/s
Weibull         a = 1.583        b = 1.973
AR(1), AR(2)    r1 = 0.820       r2 = 0.688
AR(1) is a parametric model with two parameters (a, the autoregression coefficient, and $\sigma_e^2$, the variance of the independent normal variable). The model
requires only the lag-one serial correlation coefficient ($r_1$), as both parameters
depend only upon $r_1$.
AR(2) has three parameters ($\phi_1$ and $\phi_2$, the autoregression coefficients, and $\sigma_e^2$,
the variance of the independent normal variable), all functions of $r_1$ and $r_2$, the lag-one and lag-two serial correlation coefficients listed in Table 2.
Of the six methods, the Markov chain is the one that requires the highest number of parameters. The number of parameters required changes with the number of
states used for the wind speed. In this study, 10 states were chosen for the wind
speed, each 1.5 m/s wide. This resulted in 90 transition probabilities to be determined from the observed wind speed data set, given that the probabilities
in any row of the transition probability matrix sum to 100%. The
transition probability matrix of the data set is given in Table 3. Not only the transition
probabilities but also the wind speed distribution in each state should be known
for this method. In this study, wind speed was assumed to be distributed uniformly
over the states except for the last one (state of highest wind speeds with no upper
limit), where the one-parameter gamma distribution was used. In State 1 with the
lower limit of zero, the probability of occurrence of zero wind speed was also taken
into consideration in order to reproduce the zero wind speeds, although their
occurrence was very low.

Fig. 4. Normal probability distribution function fitted to the observed and simulated random wind
speed sequences.

Fig. 5. Weibull probability distribution function fitted to the observed and simulated random wind
speed sequences.

Table 3
Transition probability matrix of the observed hourly mean wind speed data set

Pij       j = 1    2        3        4        5        6        7        8        9        10
i = 1     0.7053   0.2779   0.0144   0.0015   0.0008   0.0001   0.0000   0.0000   0.0000   0.0000
2         0.2405   0.6089   0.1306   0.0153   0.0041   0.0005   0.0000   0.0001   0.0000   0.0000
3         0.0256   0.2839   0.5317   0.1352   0.0178   0.0048   0.0005   0.0003   0.0002   0.0000
4         0.0042   0.0491   0.3116   0.4954   0.1191   0.0179   0.0023   0.0003   0.0000   0.0000
5         0.0008   0.0176   0.0865   0.3486   0.4311   0.0978   0.0168   0.0008   0.0000   0.0000
6         0.0000   0.0089   0.0266   0.1197   0.3437   0.4013   0.0865   0.0111   0.0022   0.0000
7         0.0000   0.0152   0.0076   0.0455   0.1212   0.3561   0.3485   0.1061   0.0000   0.0000
8         0.0000   0.0000   0.0000   0.0526   0.0000   0.2105   0.3684   0.2632   0.0789   0.0263
9         0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.1818   0.3636   0.3636   0.0909
10        0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000   0.0000
There is no parameter to be listed for the wavelet approach, as it is a nonparametric method. The length of the series to be used in this method is equal to $2^K$,
where K is a positive integer and equal to 15 in this study. This corresponds to a
series 32768 hours in length. Another requirement for the wavelet approach is that
the data set should be of a non-skewed structure. Therefore, the part of the
observed series used for the wavelet approach was transformed by using the power
transformation with h = 0.3853.
3.3. Simulation and results
A thousand-year-long (8760000-hour) series was generated for each method.
The correlogram, frequency distribution of maximum wind speeds and wind duration curve obtained from the simulations are compared to those of the
observed series.
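The three diagnostics named above might be computed as sketched below. The function names are hypothetical and NumPy is assumed; a year is taken as 8760 hours, consistent with the 8760000-hour length of the thousand-year series, and the wind duration curve is taken as wind speed against the percentage of time it is equalled or exceeded.

```python
import numpy as np

HOURS_PER_YEAR = 8760  # 8760000 hours / 1000 years

def correlogram(x, max_lag):
    """Serial correlation coefficients r_1 .. r_max_lag of a series."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    denom = np.sum(d * d)
    return np.array([np.sum(d[:-k] * d[k:]) / denom for k in range(1, max_lag + 1)])

def annual_maxima(x):
    """Maximum hourly value of each complete 8760-hour year."""
    x = np.asarray(x, dtype=float)
    n_years = x.size // HOURS_PER_YEAR
    return x[: n_years * HOURS_PER_YEAR].reshape(n_years, HOURS_PER_YEAR).max(axis=1)

def duration_curve(x):
    """Return (percentage of time, wind speed equalled or exceeded for that time)."""
    speeds = np.sort(np.asarray(x, dtype=float))[::-1]
    pct = 100.0 * np.arange(1, speeds.size + 1) / speeds.size
    return pct, speeds
```

Applying the same three functions to the observed and to each simulated series would give directly comparable curves.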
It is obvious that the hourly mean wind speed time series has a highly dependent
structure. The normal and Weibull distributions, however, are of independent
structures (Fig. 6), yet they are very common methods used in generating wind
speed data. These methods may be useful in offering, to the structural designer, the
highest wind speed that the structure will possibly face during its lifetime.
It is seen from Fig. 4 that wind speed data generated by the normal distribution
fit the observed series very well. It is seen in Fig. 5 that the Weibull fit is perfect as
well.
Unlike those two methods, the AR(1), AR(2), and Markov chain methods
were seen to reproduce the dependence structure of the series. However, with increasing
lags in time, the success of those methods in reproducing the correlation structure
of the series decreases (Fig. 6). The wavelet method was found
to be the best of the six methods studied in preserving the correlation structure of the series.

Fig. 6. Correlogram of the observed and simulated wind speeds.

Fig. 7. Cumulative frequency diagram of maximum wind speeds.
The annual maximum values of the simulation series were compared in Fig. 7. It
is seen that the normal distribution, AR(1) and AR(2) produced similar maxima,
whereas the wavelet approach produced higher, the Markov chain slightly lower
and the Weibull distribution considerably lower maxima. From the structural
engineering point of view, therefore, it is safer to use the wind load due to the
maximum wind speed generated by the wavelet-based approach.
Maxima obtained from the Markov chain method should be discussed specifically. There are three vertical jumps (one of them is very obvious) in the cumulative frequency diagram of the maxima of this method, as is seen in Fig. 7. The
reason for those jumps can be explained very simply. It is seen from Table 3 that
the probabilities of transition of the wind speed to the highest states are too low,
making transition of wind speed to those states almost impossible in the simulation
series. It is only possible to make a transition to State 10 if the previous state in the
simulation series is either State 8 or State 9. Otherwise State 10 is not simulated.
This causes the maximum value of the series to be bounded by the upper limit of
State 9, which was taken as 13.5 m/s in this study. A very small jump exists in the
frequency curve in Fig. 7 due to this circumstance. Similarly, State 9 can be simulated if and only if the previous state of the wind speed is one of the following
states: 3, 6, 8, 9 and 10 (Table 3). The big jump in the cumulative frequency curve
in Fig. 7 is due to this situation. It is seen that the maxima of the simulated series
are bounded by the upper limit of State 8, which was taken as 12 m/s in this study.
The third jump at the very beginning of the curve (close to the y-axis of the graph)
is due to a similar situation. This is the result of not having simulated wind speeds
from State 8, which limits the maximum wind speed to 10.5 m/s at the upper limit
of State 7. This drawback of the method can be overcome by forcing the simulation series to have at least one value from the highest state that results in a
maximum wind speed data series all generated from the highest state with no upper
limit. Such a forcing can be considered quite reasonable and it does not affect the
transition probability matrix as the number of data is usually very large (of the
order of tens of thousands).
Uniformly distributed wind speeds were accepted for the intermediate states,
whereas the one-parameter gamma distribution was adopted for the highest state
(State 10 in this study). The reason for choosing this distribution is explained
below together with a discussion on other distributions.
A distribution with no upper limit should be used for the highest state so that
maximum wind speeds higher than those in the observed series can possibly be
generated. Therefore, in this study, it was first thought to simulate wind speeds of
the highest state by using the exponential distribution shifted to that state as Sahin
and Sen [11] did. This is quite a reasonable choice for simulating the wind speeds
in that state. However, it was seen that the exponential distribution generated
lower maximum wind speeds compared to those generated by other methods.
Therefore, the Gumbel distribution was tested. It was seen that the maximum wind
speeds generated by this distribution were too low compared to those obtained by
the other methods. The Frechet distribution, which is accepted as the distribution
of the maximum wind speeds [1], was also found to be unsuccessful in generating
maximum wind speeds compared to other methods. The distribution generated low
maximum wind speeds. In the end, the two-parameter gamma distribution was fitted, which resulted again in low maximum wind speeds. Finally, the one-parameter
gamma distribution was fitted and results comparable to those of the other methods (in Fig. 7) were obtained.
The conclusion that can be drawn from those trials is that a one-parameter distribution
fits the highest state better than distributions with two or more parameters. If the
standard deviation of the highest state, which is bounded by the lower and upper limits
of the state, is included in the generation scheme, then lower maximum wind speeds are
generated. Therefore, probability distribution functions that depend on the mean alone
are better suited to the simulation of maximum wind speeds.
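The two-step generation discussed above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the `lower_limits` argument and the three-state example are hypothetical. It is also assumed here that the one-parameter gamma distribution has unit scale, with its shape parameter equal to the mean excess of the observed speeds above the lower limit of the highest state. The forcing of at least one value from the highest state is not included.

```python
import random


def simulate_wind_speeds(P, lower_limits, top_mean_excess, n, seed=None):
    """Two-step Markov chain generation (illustrative sketch).

    P               : row-stochastic transition matrix as a list of lists
    lower_limits    : lower speed limit of each state, in ascending order
    top_mean_excess : assumed shape parameter of the unit-scale gamma
                      distribution used in the highest state
    """
    rng = random.Random(seed)
    n_states = len(P)
    top = n_states - 1
    state = 0  # assumption: the chain starts in the lowest state
    states, speeds = [], []
    for _ in range(n):
        # Step 1: draw the next state from the row of the current state.
        state = rng.choices(range(n_states), weights=P[state])[0]
        # Step 2: draw the magnitude within the selected state.
        if state < top:
            # Uniform between the limits of a first or intermediate state.
            speed = rng.uniform(lower_limits[state], lower_limits[state + 1])
        else:
            # Unbounded highest state: lower limit plus a gamma variate.
            speed = lower_limits[top] + rng.gammavariate(top_mean_excess, 1.0)
        states.append(state)
        speeds.append(speed)
    return states, speeds


# Hypothetical three-state example (not the matrix of this study).
P_example = [[0.7, 0.3, 0.0],
             [0.2, 0.6, 0.2],
             [0.0, 0.5, 0.5]]
states, speeds = simulate_wind_speeds(P_example, [0.0, 2.0, 4.0], 1.5, 2000, seed=1)
```

Because the gamma variate has no upper bound, speeds above any observed maximum can occur, which is the property required of the highest state.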
The transition probability matrix of the wind speed series simulated by the Markov chain
method is given in Table 4. It is almost the same as its observed counterpart given in
Table 3, which means that the Markov chain based simulation technique reproduced the
states of the wind speed series very well.
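The comparison between the observed and simulated matrices presupposes that a transition probability matrix is estimated from a sequence of states. A minimal sketch of that estimation is given below, assuming that states are coded as integers 0 to K - 1 and that relative transition frequencies are used as estimates. The function names are illustrative only.

```python
def transition_matrix(states, n_states):
    """Estimate P[i][j] as the relative frequency of transitions i -> j."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states, states[1:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left keeps a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def max_abs_difference(P, Q):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(p - q) for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q))
```

Applying `max_abs_difference` to the observed and simulated matrices gives a single number that summarises how closely the two agree.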
Table 4
Transition probability matrix of hourly mean wind speed data simulated by Markov chain method

Pij     j=1     2       3       4       5       6       7       8       9       10
i=1     0.7049  0.2782  0.0143  0.0016  0.0009  0.0001  0.0000  0.0000  0.0000  0.0000
2       0.2406  0.6086  0.1309  0.0153  0.0041  0.0005  0.0000  0.0001  0.0000  0.0000
3       0.0255  0.2845  0.5318  0.1346  0.0179  0.0048  0.0005  0.0003  0.0002  0.0000
4       0.0042  0.0491  0.3115  0.4958  0.1191  0.0177  0.0023  0.0003  0.0000  0.0000
5       0.0008  0.0181  0.0860  0.3474  0.4322  0.0982  0.0165  0.0008  0.0000  0.0000
6       0.0001  0.0089  0.0259  0.1206  0.3421  0.4028  0.0861  0.0113  0.0022  0.0000
7       0.0001  0.0156  0.0077  0.0478  0.1227  0.3526  0.3491  0.1045  0.0000  0.0000
8       0.0000  0.0000  0.0000  0.0501  0.0000  0.2197  0.3667  0.2606  0.0777  0.0252
9       0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.1888  0.3619  0.3720  0.0773
10      0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  1.0000  0.0000

The wind duration curve is a graph with time percentage as abscissa and wind
speed as ordinate (Fig. 8). It is an important tool used in determining the percentage
of time that the wind speed exceeds a specified level. Wind energy production
systems use this graph in order to determine the wind energy potential of the
region under consideration. A very good fit was obtained in Fig. 8, where the wind
duration curves of the six methods were plotted together with the one extracted
from the observed series. Although the wind duration curve of the Markov chain
method fluctuates around the others, it has a fit that is good enough as well.
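The construction of a wind duration curve can be illustrated with a short sketch. It assumes the simple plotting position rank/n for the percentage of time a speed is equalled or exceeded. The plotting position used for Fig. 8 is not stated here, and the function names are hypothetical.

```python
def wind_duration_curve(speeds):
    """Return (percent_of_time, speed) pairs for a wind duration curve.

    Speeds are sorted in descending order; the k-th largest value is
    assigned the time percentage 100 * k / n (assumed plotting position).
    """
    ordered = sorted(speeds, reverse=True)
    n = len(ordered)
    return [(100.0 * (rank + 1) / n, v) for rank, v in enumerate(ordered)]


def percent_time_exceeding(speeds, level):
    """Percentage of time steps on which the wind speed exceeds `level`."""
    return 100.0 * sum(v > level for v in speeds) / len(speeds)
```

For a series of hourly means, `percent_time_exceeding(speeds, level)` gives the percentage of hours with wind speed above `level`, which is the quantity read from the curve when assessing wind energy potential.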
Fig. 8. Wind duration curve of observed and simulated wind speeds.

The first three central moments of the observed and simulated series are given in
Table 5. The maximum values and the first five lags of the correlation are also listed.
It is seen that the mean values of the simulated series are almost the same as
those of the observations. The wavelet-based method approaches its counterparts
with a relative error of 0.3%. Standard deviation and variation coefficient were
best captured by the normal probability distribution, and AR(1) and AR(2) processes.
Skewness coefficient in the wind speed time series was best reproduced by
AR(1). Higher maximums were obtained by the methods of AR(1), wavelet and
normal distribution and lower maximums by Weibull distribution. Correlation
structure, as discussed earlier, was best simulated by the wavelet-based method.

Table 5
Statistical characteristics of simulated series

Series    Mean    Standard    Coefficient    Coefficient    Maximum       Correlation coefficient
          (m/s)   deviation   of variation   of skewness    wind speed    r1      r2      r3      r4      r5
                  (m/s)                                     (m/s)
Normal    2.538   1.777       0.700          1.315          26.94         0.0005  0.0002  0.0005  0.0002  0.0002
Weibull   2.529   1.634       0.646          0.980          16.22         0.0003  0.0005  0.0003  0.0001  0.0007
AR(1)     2.537   1.776       0.700          1.309          27.30         0.815   0.661   0.536   0.436   0.355
AR(2)     2.537   1.776       0.700          1.313          25.23         0.815   0.677   0.561   0.466   0.387
Markov    2.585   2.058       0.796          0.983          21.29         0.715   0.587   0.483   0.399   0.330
Wavelet   2.566   1.833       0.714          1.438          31.13         0.715   0.580   0.523   0.443   0.421
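The characteristics used in this comparison can be computed from any series with a few lines of code. The sketch below is illustrative only. It assumes population (1/n) estimators for the variance, skewness and autocovariance, which may differ from the estimators used for Table 5, and the function name is hypothetical.

```python
import math


def summary_statistics(x, max_lag=5):
    """Mean, standard deviation, coefficients of variation and skewness,
    maximum, and lag-1..max_lag correlation coefficients of a series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n          # 1/n estimator (assumption)
    std = math.sqrt(var)
    skew = (sum(d ** 3 for d in dev) / n) / std ** 3
    # Lag-k correlation: lag-k autocovariance divided by the variance.
    r = [sum(dev[t] * dev[t + k] for t in range(n - k)) / (n * var)
         for k in range(1, max_lag + 1)]
    return {"mean": mean, "std": std, "cv": std / mean,
            "skew": skew, "max": max(x), "r": r}
```

Calling `summary_statistics` on the observed series and on each simulated series gives values that can be compared column by column.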
4. Summary and conclusion
In this study, hourly mean wind speed data sets were generated by traditional
simulation methods—the normal and Weibull probability distribution functions,
the first- and second-order autoregressive processes, and the Markov chain.
Additionally, the newly developed wavelet-based approach was used. The normal
and Weibull methods generate independent, identically distributed random numbers.
The autoregressive models include the correlation
structure of the observations and hence generate dependent series. The Markov
chain is a two-step method that first determines the state of the wind speed and
then generates its magnitude by using a preselected distribution. All the mentioned
methods are parametric and they therefore require the time series to have a specific
probability distribution. This is a drawback of parametric models more than a
limitation. A nonparametric model, of which the wavelet approach in this study is
one of the best examples, can be applied to data sets with any distribution. However, it should be kept in mind that the wavelet approach works only with sequences of zero skewness.
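The contrast drawn above between independent and dependent generators can be illustrated with a short sketch. It uses the standard first-order autoregressive recursion with normally distributed noise, and takes the mean, standard deviation and lag-1 correlation of the AR(1) row of Table 5 as illustrative parameters. Any transformation applied in the study to handle skewness or to avoid negative speeds is omitted, and the function names are hypothetical.

```python
import math
import random


def generate_iid_normal(mean, std, n, rng):
    """Independent, identically distributed normal values."""
    return [rng.gauss(mean, std) for _ in range(n)]


def generate_ar1(mean, std, r1, n, rng):
    """AR(1): x[t] = mean + r1 * (x[t-1] - mean) + noise."""
    # Noise variance chosen so that the series keeps the variance std**2.
    noise_std = std * math.sqrt(1.0 - r1 * r1)
    x = [rng.gauss(mean, std)]
    for _ in range(n - 1):
        x.append(mean + r1 * (x[-1] - mean) + rng.gauss(0.0, noise_std))
    return x


def lag1_correlation(x):
    """Sample lag-1 correlation coefficient of a series."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


rng = random.Random(42)
iid_series = generate_iid_normal(2.537, 1.776, 20000, rng)
ar1_series = generate_ar1(2.537, 1.776, 0.815, 20000, rng)
```

For a long series, the lag-1 correlation of `iid_series` is close to zero, whereas that of `ar1_series` is close to the prescribed value. The theoretical lag-k correlation of an AR(1) process is r1 raised to the power k.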
The correlation structure of the observations, distribution of the maximum wind
speeds, wind duration curve and statistical features of the series were used in order
to compare the success of the methods.
The generation of maximum wind speeds requires special attention in Markov
chain based simulation methods. Based upon the application in this study, it is
concluded that the uniform probability distribution function is suitable for use in
the first and intermediate states. A probability distribution function with no upper
limit should be used for the highest state. It is concluded that the one-parameter
gamma distribution fits the wind speed data in the highest state of the series
adequately for normal regions, such as the one used in this study.
Some methods performed better in preserving some particular characteristics
than other methods did. For example, the wavelet method is clearly the best in
preserving the correlation structure of the sequence, and it preserves the other
statistical features of the series as well as the other methods do. Therefore, in
conclusion, the wavelet method is proposed as an alternative to the classical
generation schemes for the simulation of hourly mean wind speed data.
Acknowledgements
The wavelet approach presented in this study is a result of an earlier cooperation
between the first author (H. Aksoy) and Professor M. Bayazit of Istanbul Technical University, Turkey, whom the authors sincerely thank.
References
[1] Simiu E, Scanlan RH. Wind effects on structures. New York: John Wiley & Sons; 1986.
[2] American Society of Civil Engineers. Minimum design loads for buildings and other structures.
ANSI/ASCE 7-93 (Revision of ANSI/ASCE 7-88), New York, 1994.
[3] Simiu E, Filliben JJ, Shaver JR. Short-term records and extreme wind speeds. ASCE, Journal of
the Structural Division 1982;108(ST11):2571–7.
[4] Cheng EDH, Chiu ANL. Extreme winds simulated from short-period records. ASCE, Journal of
Structural Engineering 1985;111(1):77–94.
[5] Cheng EDH, Chiu ANL. Extreme winds generated from short records in a tropical cyclone-prone
region. Journal of Wind Engineering and Industrial Aerodynamics 1988;28:69–78.
[6] Cheng EDH. Wind data generator: a knowledge-based expert system. Journal of Wind Engineering
and Industrial Aerodynamics 1991;38:101–8.
[7] Cheng EDH, Chiu ANL. An expert system for extreme wind simulation. Journal of Wind Engineering and Industrial Aerodynamics 1990;36:1235–43.
[8] Kaminsky FC, Kirchhoff RH, Syu CY, Manwell JF. A comparison of alternative approaches for
the synthetic generation of a wind speed time series. Transactions of the ASME 1991;113:280–9.
[9] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time
series. Renewable Energy 2000;21:23–35.
[10] Dukes MDG, Palutikof JP. Estimation of extreme wind speeds with very long return periods. Journal of Applied Meteorology 1995;34:1950–61.
[11] Sahin AD, Sen Z. First-order Markov chain approach to wind speed modelling. Journal of Wind
Engineering and Industrial Aerodynamics 2001;89:263–9.
[12] Castino F, Festa R, Ratto CF. Stochastic modelling of wind velocities time series. Journal of Wind
Engineering and Industrial Aerodynamics 1998;74–76:141–51.
[13] Kitagawa T, Nomura T. A wavelet-based method to generate artificial wind fluctuation data. Journal of Wind Engineering and Industrial Aerodynamics 2003;91:943–64.
[14] Garcia A, Torres JL, Prieto E, De Francisco A. Fitting wind speed distributions: a case study.
Solar Energy 1998;62(2):139–44.
[15] Grigoriu M. Estimates of design wind from short records. ASCE Journal of the Structural Division
1982;108(ST5):1034–48.
[16] Heckert NA, Simiu E, Whalen T. Estimates of hurricane wind speeds by ‘peaks over threshold’
method. ASCE Journal of Structural Engineering 1998;124(4):445–9.
[17] Lechner A, Simiu E, Heckert NA. Assessment of ‘peaks over threshold’ methods for estimating
extreme value distribution tails. Structural Safety 1993;12:305–14.
[18] Pandey MD, Van Gelder PHAJM, Vrijling JK. The estimation of extreme quantiles of wind velocity using L-moments in the peaks-over-threshold approach. Structural Safety 2001;23:179–92.
[19] Simiu E, Heckert NA. Extreme wind distribution tails: a ‘peaks over threshold’ approach. ASCE,
Journal of Structural Engineering 1996;122(5):539–47.
[20] Stedinger JR, Vogel RM, Foufoula-Georgiou E. Frequency analysis of extreme events. In: Maidment
D, editor. Handbook of hydrology. New York: McGraw Hill Book Co; 1993 [Chapter 18].
[21] Rao RM, Bopardikar AJ. Wavelet transforms, introduction to theory and applications. Reading,
MA: Addison-Wesley; 1998.
[22] Bayazit M, Aksoy H. Using wavelets for data generation. Journal of Applied Statistics 2001;28(2):
157–66.
[23] Bayazit M, Onoz B, Aksoy H. Nonparametric streamflow simulation by wavelet or Fourier analysis. Hydrological Sciences Journal 2001;46(4):623–34.
[24] Aksoy H. Storage capacity for river reservoirs by wavelet-based generation of sequent peak algorithm. Water Resources Management 2001;15(6):423–37.
[25] Aksoy H, Akar T, Unal NE. Wavelet analysis for modeling suspended sediment discharge. Nordic
Hydrology 2004;35:165–74.
[26] Unal NE, Aksoy H, Akar T. Annual and monthly rainfall data generation schemes. Stochastic
Environmental Research and Risk Assessment 2004;18(6):in press.