A machine learning approach for analyzing and predicting wind

advertisement
1
A machine learning approach for analyzing and predicting wind turbine response to
2
diurnal and nocturnal wind fields
3
J. Park1, K. Smarsly2, K. H. Law1 and D. Hartmann3
4
5
6
1
Department of Civil and Environmental Engineering, Stanford University, 473 Via
Ortega, Stanford, CA 94305, USA; email: {jkpark11, law}@stanford.edu
7
8
2
9
10
Department of Civil Engineering, Bauhaus University Weimar,
Coudraystr. 7, 99423 Weimar, GERMANY; email: kay.smarsly@uni-weimar.de
11
12
13
3
Department of Civil and Environmental Engineering, Ruhr University Bochum,
Universitätsstr. 150, 44780 Bochum, GERMANY; email: hartus@inf.bi.rub.de
14
15
ABSTRACT
16
17
Site- and time-specific wind field characteristics have a significant impact on the
18
structural response and the lifespan of wind turbines. This paper presents a machine
19
learning approach towards analyzing and predicting the response of a wind turbine
20
structure to diurnal and nocturnal wind fields. Machine learning algorithms are applied (i)
21
to better understand the changes of wind field characteristics due to atmospheric
22
conditions, and (ii) to gain insights into the wind turbine loads being affected by the wind
23
field. Using Gaussian Mixture Models, the variations in wind field characteristics are
1
24
investigated by comparing the joint probability density functions of selected wind field
25
features. The wind field features are constructed from long-term monitoring data taken
26
from a 500 kW wind turbine in Germany that is used as a reference system. Furthermore,
27
employing Gaussian Discriminant Analysis, representative daytime and nocturnal wind
28
turbine loads are compared and analyzed.
29
30
INTRODUCTION
31
32
Stochastic wind effects can cause fluctuations in the wind turbine responses such as loads,
33
displacement, fatigue damages and power outputs. Typically, recorded site-specific wind
34
field data are described by time-averaged statistics, such as mean wind speed, mean wind
35
direction, turbulence intensity and power exponent describing the vertical profile of the
36
mean wind speed. In addition to the mean statistics, the distributions of such wind
37
parameters are important factors in evaluating the wind turbine responses. Currently, the
38
wind turbine design standard of the International Electrotechnical Commission (IEC
39
61400-1, 2005) uses the Weibull probability distribution to describe only the mean wind
40
speed. In this study, we describe site-specific wind conditions by using multivariate
41
probability density functions (PDFs) for wind parameters. We implement Gaussian
42
Mixture Models (GMMs) to construct the PDFs to study how wind field characteristics
43
vary over the day and night time periods.
44
45
Analyzing wind turbine responses is difficult due to the complex interactions between a
46
wind turbine structure and the stochastic wind flow. Describing the input wind field is
2
47
even more difficult. With recent advances in sensor technologies, however, a wind
48
turbine can now be continuously monitored and the measurements data can be used to
49
gain a better understanding on the dynamic responses of the wind turbine structure. Using
50
the measured wind field as input data and the wind turbine response as output data, the
51
relationship between them can be studied by applying supervised machine learning
52
techniques. In this study, we investigate the influence of wind fields on wind turbine
53
response by constructing the response classification models using Gaussian discriminant
54
analysis. By combining the variation of a wind field captured by the PDF of the wind
55
features and the influences of wind fields on the wind turbine captured by the response
56
classification model, new information can be generated about how different atmospheric
57
conditions affect wind turbine responses. In this study, we elucidate the reason for the
58
different levels of wind turbine response at day and night time periods by relating the
59
time-specific PDFs for the wind features and the wind turbine response classification
60
functions. The machine learning approach is demonstrated using the data sets collected
61
from a 500 kW wind turbine located in Germany.
62
63
This paper is organized as follows: First, the integrated life-cycle management (LCM)
64
framework implemented for the 500 kW wind turbine is presented. The system, which
65
includes two major components, namely a “structural health monitoring (SHM) system”
66
and a “decentralized software system”, has been in operation for over four years. The
67
methodologies for the proposed machine learning approaches are then described. The
68
basic features of Gaussian Mixture Models (GMMs) are introduced and the application of
69
GMMs to construct the multivariate probability density function for the collected wind
3
70
field data is presented. The use of Gaussian discriminant analysis to study the effect of
71
wind field on wind turbine response is discussed. The paper is concluded with a brief
72
summary and discussion.
73
74
WIND TURBINE LIFE-CYCLE MANAGEMENT FRAMEWORK
75
76
This study is based on the monitoring data collected on a 500 kW wind turbine, located in
77
Germany (see Figure 1). The wind turbine has a hub height of 65 m and upwind rotor
78
diameter of 40.3 m. A life cycle management (LCM) framework has been developed for
79
the monitoring and management operation of the wind turbine. The LCM framework, as
80
shown in Figure 2, consists of six interconnected components that are installed at
81
geographically distributed locations:
82
83
1.
A SHM system comprises of sensors, several data acquisition units as well as a
84
local computer.
The SHM system is installed on the wind turbine to
85
continuously collect not only structural, environmental but also operational
86
data.
87
88
2.
A decentralized software system is installed on different computers at the
89
Institute for Computational Engineering (ICE) in Bochum, Germany. The
90
software system is remotely connected to the SHM system to continuously
91
collect, process and archive the recorded data. Additionally, software modules
4
92
have been developed to support automated data management and provide
93
Internet-enabled remote access to the monitoring data.
94
3.
An agent-based self-diagnostic system for detecting sensor malfunctions has
been implemented to ensure reliability and availability of the SHM system.
95
96
97
4.
A model updating component is included to continuously update the
98
computational wind turbine models and to support damage detection analyses
99
based on continuously updated monitoring data.
100
101
5.
A management module is incorporated to enable online life-cycle
102
management through remote data access and analyses of the structural,
103
environmental, and operational conditions of the wind turbine.
104
105
6.
A 32-node PC cluster, also located at the ICE in Bochum, is integrated into
106
the LCM framework to provide high-performance parallel computing
107
capabilities for solving computationally intensive calculations.
108
5
109
110
Figure 1. Monitored wind turbine and anemometers used.
111
112
Details of the components of the LCM framework have been previously described by
113
Smarsly et al. (2012a,b,c, 2013a,b) and Hartmann et al. (2011). This long term life cycle
114
management system provides valuable information that can be used not only to monitor
115
the integrity of the wind turbine structure, but also to support new research and to test
116
algorithmic developments.
117
applicability of machine learning algorithms for analyzing the relationships between wind
118
turbine response and wind field characteristics. The following briefly discusses the two
119
basic components that are related to the collection and management of the measurement
120
data, namely, the “SHM System” and the “Decentralized Software System”. We then
121
discuss the data sets employed in this study.
Using the monitoring data, this study investigates the
6
122
123
Figure 2.Architecture of the LCM framework.
124
125
Structural health monitoring system
126
127
A SHM system is developed and installed inside and outside of the zinc-coated steel
128
tower as well as at the concrete foundation of the wind turbine structure (Lachmann et al.
129
2009). The SHM system consists of a network of sensors, data acquisition units (DAUs)
130
and an on-site server installed in the maintenance room of the wind turbine. Six three-
131
dimensional accelerometers, type PCB-3713D1FD3G, are mounted on five different
132
levels on the inner surface of the wind turbine tower. The sensitivity of the
133
accelerometers is 700 mV/g with a measurement range of ± 3.0 g and a frequency range
134
between 0 and 100 Hz. Three additional accelerometers are placed at the foundation of
135
the wind turbine. Since very small accelerations are expected at the foundation, single-
7
136
axis piezoelectric seismic ICP accelerometers of type PCB-393B12 with a sensitivity of
137
10,000 mV/g are used; their measurement range amounts to ± 0.5 g. Furthermore, six
138
inductive displacement transducers HBM-W5K, having a range of 5 mm, are mounted at
139
21 m and 42 m height, inside the tower. To determine temperature effects on the
140
displacement measurements, the inductive displacement transducers are complemented
141
by RTD surface sensors Pt100 capable of measuring temperature from –60 °C to +200 °C.
142
Additional temperature sensors are placed inside and outside the tower to gauge
143
temperature gradients. Last but not least, two anemometers are deployed. The first one, a
144
cup anemometer that is directly connected to the integrated supervisory control and data
145
acquisition (SCADA) system of the wind turbine, is installed on top of the nacelle at 67
146
m height (Figure 1). The second one, a three-dimensional ultrasonic anemometer, is
147
mounted on a telescopic mast adjacent to the wind turbine at 13 m height; it continuously
148
monitors the horizontal and vertical wind speed (0...60 m/s), the wind direction (0...360°)
149
and the air temperature (–40...60°C).
150
151
The sensors are controlled by the DAUs that are connected to the on-site server located in
152
the maintenance room of the wind turbine. For the acquisition of temperature data, three
153
4-channel Picotech RTD input modules PT-104 are deployed. For the acquisition of
154
acceleration and displacement data, four Spider8 measuring units are used. Each Spider8
155
unit has separate A/D converters ensuring simultaneous measurements at sampling rates
156
between 1 Hz and 9,600 Hz. All data sets, being sampled and digitized, are continuously
157
forwarded from the DAUs to the on-site server for temporary storage. In addition to the
158
structural and environmental sensor data collected by the DAUs, operational wind turbine
8
159
data, such as power production and rotational speed, is taken directly from the wind
160
turbine’s internal SCADA system.
161
162
All recorded data sets, referred to as “primary monitoring data”, are forwarded
163
continuously from the data acquisition units to the on-site server for temporary storage
164
and periodic backups. Using a permanently installed DSL connection, the on-site server
165
transfers the primary monitoring data to the decentralized software system of the LCM
166
framework.
167
168
Decentralized software system
169
170
Installed at different computers at ICE in Bochum, the decentralized software system
171
serves two basic purposes. First, it provides a persistent storage for the data recorded
172
from the wind turbine supporting automated data management and processing. Second, it
173
includes a set of program modules enabling remote access to the LCM framework.
174
175
The collected primary monitoring data (or “raw data”) is transferred from the on-site
176
server to the main server, and then, to the MySQL central monitoring database system
177
both installed at the ICE. The data transmission is automatically executed by a time-based
178
UNIX utility running on the on-site server that ensures the periodic execution of tasks
179
according to prescribed scheduled time intervals. On the main server at ICE, the primary
180
monitoring data is converted into “secondary monitoring data” that can easily be
181
interpreted by human users. The primary monitoring data is synchronized and aggregated.
9
182
Metadata is added to provide specifications of the installed sensors, the IDs of data
183
acquisition units, output specification details, date and time formats, etc. Furthermore,
184
“tertiary monitoring data”, summarizing the basic statistics of the data sets, such as
185
quartiles, medians and means, is computed at given time intervals. The secondary and
186
tertiary monitoring data is backed up in a RAID-based storage system and then stored in
187
the monitoring database.
188
189
Once the data is successfully converted and persistently stored, an acknowledgement is
190
sent from the main server at ICE to the on-site server in the wind turbine, whereupon the
191
respective primary monitoring data on the on-site server is deleted. In order to avoid
192
inconsistencies, all data sets involved are automatically “locked” during the conversion
193
process to prohibit unacceptable accessed by external software programs or by human
194
users. The data conversion process described is implemented using the commercial tool
195
“PDI” (Pentaho Data Integration), which offers metadata-driven conversion and data
196
extraction capabilities (Roldan 2009, Castors 2008). Descriptions of performance tests
197
validating the automated data conversion process have been documented by Smarsly and
198
Hartmann (2009a, 2009b, 2010).
199
200
Remote access to the monitoring data is provided in two ways, through a web interface
201
and a direct database connection. In particular, the web interface, installed at ICE in
202
Bochum, offers problem dependent GUIs for remotely visualizing, exporting and
203
analyzing the monitoring data, while the database connection allows accessing the data
204
directly. The web interface is implemented as a web service written in Java; it is based on
10
205
Adobe Flex, an open-source framework for developing Rich Internet Applications (RIAs),
206
using the Adobe Flash platform (Adobe 2007, Polanco and Pedersen 2009). Figure 3
207
shows the web interface exemplarily displaying the monitoring data collected over 4
208
hours, on October 1, 2013. Using the control panel on the right hand side and the menu
209
bar on top of the GUI, the user can select the data collected by specific sensors, specify
210
the time intervals to be plotted, conduct online data analyses, and export data sets. Figure
211
3 displays a time history of horizontal acceleration data collected by a Spider8 measuring
212
unit through a 3D accelerometer that is installed at 63 m height of the wind turbine tower.
213
In addition to the acceleration data, a time history of horizontal wind speed (labeled U67),
214
collected by the cup anemometer installed on top of the wind turbine nacelle at 67 m
215
height, is shown.
216
217
218
Figure 3. Web interface providing remote access to the LCM framework.
219
11
220
Data sets used in this study
221
222
The machine learning approaches to be discussed employ various inputs (i.e. wind field
223
statistics) and the corresponding outputs (i.e. structural response data) in order to
224
understand the wind characteristics and to make reliable predictions on the structural
225
wind turbine behavior, by “learning” from the existing input-output patterns. As input,
226
long-term wind field statistics computed from the monitoring data recorded by the
227
anemometers described above are used in this study. Specifically, the input feature vector
228
Μ…67 , π‘ˆ
Μ…13 , (𝑄3 − 𝑄1 )13 ) consists of three statistical features, where
𝒙 = (π‘₯1 , π‘₯2 , π‘₯3 ) = (π‘ˆ
229
Μ…67 is the mean of a 20-minute wind speed time series recorded at 67 m height by the cup
π‘ˆ
230
Μ…13 is the mean of a 20-minute wind speed time series obtained at 13 m
anemometer, π‘ˆ
231
height by the ultrasonic anemometer, andο€ (𝑄3 − 𝑄1 )13ο€ is the interquartile, the difference
232
between the 75 percentile and the 25 percentile for the data distribution over the 20-
233
minute wind speed time series recorded at the 13 m height. Since the monitoring data
234
stored in the database system includes only the statistical mean and quartile data for each
235
20-minute period, the interquartile is used in this study as a measure to quantify the level
236
of turbulence.
237
238
For the output response, the acceleration of the tower recorded by one of the
239
accelerometers located at 21 m height is used. Specifically, the interquartile values for the
240
20-minute acceleration time series are employed as an indirect measure for the averaged
241
magnitude of the wind turbine tower acceleration. Assuming the acceleration time series
242
are of zero or constant mean values, the interquartile value is a reasonable measure
12
243
analogous to the RMS value of the acceleration time series, which is commonly used for
244
quantifying the averaged magnitude of acceleration response of on a structure. For
245
classification, the interquartile values are evenly divided and categorized into 𝑁𝐢
246
discretized classes based on their magnitude; a response class is denoted by 𝑦𝐢 , 𝑦𝐢 ∈
247
{1, … , 𝑁𝐢 }. In total, 7,960 pairs of 𝒙 (input feature vector) and 𝑦𝐢 (the interquartile output
248
class), provided by the LCM framework, serve as the basis for this study.
249
250
CHARACTERIZATION OF WIND FIELDS
251
252
Gaussian mixture models (GMMs) can be regarded as a statistical unsupervised learning
253
to extract the underlying hidden structure by estimating the probability density of features
254
from a statistical data set. Due to their capability of representing different probability
255
distributions, GMMs have been widely used in computer vision and speaker recognition
256
(Stauffer and Grimson 1999; Reynolds et al., 2000). GMMs have also been applied to
257
describe the stochastic load variations in a distributed energy grid (Singh et al., 2010;
258
Gonzalez-Longatt et al., 2012). In this study, GMMs are used to study the wind field
259
characteristics for different time periods of a day by constructing the multivariate
260
probability density function based on diurnal and nocturnal wind statistics.
261
262
Gaussian Mixture Models
263
264
A Gaussian mixture model (GMM) can be regarded as a multivariate probability density
265
function written in terms of the weighted sum of the Gaussian probability density
13
266
functions. Due to the combination of multiple Gaussian functions, a GMM can represent
267
a complex multi-modal probability density function. For the statistical wind input feature
268
Μ…67 , π‘ˆ
Μ…13 , (𝑄3 − 𝑄1 )13 ) , the multivariate PDF 𝑓𝑋 (𝒙) is
vector 𝒙 = (π‘₯1 , π‘₯2 , π‘₯3 ) = (π‘ˆ
269
expressed in terms of a linear combination of 𝐾 probability density functions:
𝐾
𝑓𝑋 (𝒙) = ∑ πœ™π‘— 𝑝𝑗 (𝒙)
(1)
𝑗=1
270
where 𝑝𝑗 (𝒙) is the Gaussian probability density fucntion (GPDF) for the 𝑗th mixture
271
component of the GMM and πœ™π‘— , ∑𝐾
𝑗=1 πœ™π‘— = 1, is the corresponding mixture weight. The
272
𝑗th GPDF 𝑝𝑗 (𝒙) can be fully described by its mean vector 𝝁𝑗 and covariance matrix πšΊπ‘—
273
(Reynolds et al., 2000):
𝑝𝑗 (𝒙) =
1
1
𝑇
exp⁑(− (𝒙 − 𝝁𝑗 ) πšΊπ‘—−1 (𝒙 − 𝝁𝑗 ))
2
√(2πœ‹)𝑛 |πšΊπ‘— |
(2)
274
where 𝑛 is the dimension of input feature vector 𝒙. To construct the PDF 𝑓𝑋 (𝒙), the
275
parameters 𝝓 = {πœ™1 , … , πœ™πΎ }, 𝝁 = {𝝁1 , … , 𝝁𝐾 } and 𝚺 = {𝚺1 , … , 𝚺𝐾 } are determined from
276
(𝑖) Μ… (𝑖)
(𝑖)
Μ…67
the measurement (training) data set 𝒙(𝑖) = ⁑ (π‘ˆ
, π‘ˆ13 , (𝑄3 − 𝑄1 )13 ), 𝑖 = 1, … , π‘šπ‘’ , where
277
𝒙(𝑖) is the 𝑖th input feature vector and π‘šπ‘’ denotes the size (number of input feature
278
vectors) of the training data set.
279
Assuming the input vectors 𝒙(𝑖) , 𝑖 = 1, … , π‘šπ‘’ , in the training data set are independent,
280
the parameters 𝝓, 𝝁⁑and⁑𝚺 can be estimated as the values that maximize the following
281
log-likelihood estimation from the measured wind input feature data as:
14
π‘šπ‘’
𝑙(𝝓, 𝝁, 𝚺) = ln ∏ 𝑝( 𝒙(𝑖) ; ⁑𝝓, 𝝁, 𝚺)
𝑖=1
π‘šπ‘’
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= ∑ ln 𝑝(𝒙(𝑖) ; ⁑𝝓, 𝝁, 𝚺) ⁑
𝑖=1
π‘šπ‘’
(3)
⁑𝐾
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= ∑ ln ( ∑ 𝑝⁑(𝒙(𝑖) , 𝑧 (𝑖) ; 𝝁, 𝚺, 𝝓))
𝑧 (𝑖) =1
⁑𝐾
𝑖=1
π‘šπ‘’
282
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= ∑ ln ( ∑ 𝑝(𝒙(𝑖) |𝑧 (𝑖) ; 𝝁, 𝚺)𝑝(𝑧 (𝑖) ; 𝝓))
𝑧 (𝑖) =1
𝑖=1
283
284
where 𝑝(𝒙(𝑖) ; 𝝓, 𝝁, 𝚺) is the probability of the input feature 𝒙(𝑖) given the parameters,
285
𝝓, 𝝁⁑and⁑𝚺. The latent random variable 𝑧 (𝑖) specifies one of the 𝐾 possible Gaussian
286
probability distributions from which 𝒙(𝑖) is drawn. Because 𝑧 (𝑖) is a random vairable
287
individually assigned to each 𝒙(𝑖) , it follows different probablity distributions over the 𝐾
288
possible GPDFs. If 𝑧 (𝑖) is explicitly known apriori, Eq. (3) can be expressed in terms of
289
the parameters of GMM and optimized with repsect to the parameters.
290
291
Since 𝑧 (𝑖) is unknown, Eq. (3) cannot be explicitly defined; therefore, the parameters of
292
the GMM are difficult to estimate. In this study, an iterative expectation maximization
293
(EM) algorithm is employed to find the maximum likelihood estimates of the statistical
294
paraemters (Render and Walker, 1984). In EM, the lower bound of the log-likelihood
295
function 𝑙(𝝓, 𝝁, 𝚺) in Eq. (3) is represented as:
296
15
π‘šπ‘’
𝐾⁑
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑𝑙(𝝓, 𝝁, 𝚺) = ∑ ln ∑ 𝑝 (𝒙(𝑖) , 𝑧 (𝑖) ; ⁑𝝓, 𝝁, 𝚺)
𝑖=1
𝑧 (𝑖) =1
π‘šπ‘’
𝐾⁑
𝑝(𝒙(𝑖) , 𝑧 (𝑖) ; ⁑𝝓, 𝝁, 𝚺)
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= ∑ ln ∑ π‘žπ‘– (𝑧 )
π‘žπ‘– (𝑧 (𝑖) )
(𝑖)
(𝑖)
𝑖=1
𝑧
π‘šπ‘’
𝐾⁑
(4)
=1
𝑝(𝒙(𝑖) , 𝑧 (𝑖) ; ⁑𝝓, 𝝁, 𝚺)
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑≥ ∑ ∑ π‘žπ‘– (𝑧 ) ln
π‘žπ‘– (𝑧 (𝑖) )
(𝑖)
(𝑖)
𝑖=1 𝑧
=1
297
where π‘žπ‘– (𝑧 (𝑖) ), ∑𝐾⁑
π‘ž (𝑧 (𝑖) ) = 1, denotes the probability that the 𝑖th input feature 𝒙(𝑖)
𝑧 (𝑖) =1 𝑖
298
is drawn from the 𝑧 (𝑖) th GPDF among the 𝐾 GPDFs. If we let π‘Œ(𝑧 (𝑖) ) =
299
the inequality in Eq.(4) is always true for any probability distribution π‘žπ‘– (𝑧 (𝑖) ) because, by
300
the Jensen’s inequaltiy, ln E[π‘Œ(𝑧 (𝑖) )] ≥ E[ln π‘Œ(𝑧 (𝑖) )] (Boyd and Vandenberghe, 2004).
𝑝(𝒙(𝑖) ,𝑧 (𝑖) ;⁑𝝓,𝝁,𝚺)
,
π‘žπ‘– (𝑧 (𝑖) )
301
302
The EM algorithm consists of two iterative steps. For the “E-step” the probability
303
distribution π‘žπ‘– (𝑧 (𝑖) = 𝑗)⁑for⁑𝑗 = 1, … , 𝐾 is computed (i.e., the GPDF from which 𝒙(𝑖) is
304
drawn) using the current feature vector 𝒙(𝑖) and the parameters, 𝝓, 𝝁⁑and⁑𝚺, estimated in
305
the preceding (M-step):
π‘žπ‘– (𝑧 (𝑖) = 𝑗) = 𝑝(𝑧 (𝑖) = 𝑗|𝒙(𝑖) ; 𝝓, 𝝁, 𝚺)
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑=
𝑝(𝒙(𝑖) |𝑧 (𝑖) = 𝑗; 𝝁, 𝚺)𝑝(𝑧 (𝑖) = 𝑗; 𝝓)
𝑝(𝒙(𝑖) ; 𝝓, 𝝁, 𝚺)
(5)
306
Eq. (5) can be easily evaluated using Eq. (2) because here we assume 𝑧 (𝑖) = 𝑗. The “M-
307
step” then updates the parameters, 𝝓, 𝝁⁑and⁑𝚺, that maximize the lower bound of the log-
308
likelihood function in Eq. (4) by using π‘žπ‘– (𝑧 (𝑖) ) estimated in the preceding E-step. The E-
16
309
and M-steps are repeated until the parameters, 𝝓, 𝝁⁑and⁑𝚺, converge to some acceptable
310
values.
311
312
Application to wind fields
313
314
To study the diurnal and nocturnal wind field characteristics at the wind turbine site, we
315
construct the PDFs from the wind data collected from the anemometers. In this study, we
316
set 𝑑1 = 6: 00⁑AM⁑~⁑6: 00⁑PM (daytime) and 𝑑2 = 6: 00⁑~⁑to⁑6: 00⁑PM (night time). A
317
total number of 7,960 input feature vectors, measured for a period of 110 days, are
318
categorized into the two time periods 𝑑1 and 𝑑2 . Two data sets (each with 3,980 feature
319
vectors) are then used to construct the time specific PDFs, 𝑓𝑋 (𝒙|𝑑1 ) and 𝑓𝑋 (𝒙|𝑑2 ). The
320
wind input feature data for the two time periods are plotted as shown in Figure 4. It can
321
be seen that the wind field characteristics for the two time periods are quite different
322
because of the temperature gradients with respect to the elevation; the wind speeds and
323
the levels of turbulence are usually higher during the day than during the night due to
324
more active airflow mixing (Stull, 1988).
17
Μ…67 and π‘ˆ
Μ…13
Figure 4. Wind data collected during day (circle) and night (cross) times. π‘ˆ
denote the mean of a 20-minute wind speed time series data at 67 m height and at 13 m
height, respectively.⁑(𝑄3 − 𝑄1 )13 denotes the interquartile of a 20-minute wind speed
time series at 13 m height.
325
Figure 5 shows the PDFs 𝑓𝑋 (𝒙|𝑑1 ) and 𝑓𝑋 (𝒙|𝑑2 ) constructed by applying the GMM to the
326
diurnal and the nocturnal wind data sets, respectively. In this study, 3 GPDFs are used to
327
construct the PDFs 𝑓𝑋 (𝒙|𝑑1 ) and 𝑓𝑋 (𝒙|𝑑2 ). The number of GPDFs is determined such that
328
the Akaike information criterion (AIC) (Akaike, 1974), that quantifies the tradeoff
329
between the complexity of a statistical model and the fitness of the statistical model to
330
data, is maximized. Figure 5 shows the shapes and the densities of the PDFs, which
331
depict the different characteristics for the wind fields for the two time periods. In Figure 5,
332
the regions with darker color correspond to the regions with lots of measurement data
333
points in Figure 4. The constructed PDFs effectively describe the density of the wind
334
input feature data and the correlations among the different wind input features using only
335
a small number of GMM parameters.
18
(a) 𝑓𝑋 (𝒙|𝑑1 ) (daytime)
(b) 𝑓𝑋 (𝒙|𝑑2 ) (night time)
Figure 5. Multivariate probability density functions for the wind field input features.
336
337
With the multivariate probability density function 𝑓𝑋 (𝒙) for the three wind field input
338
Μ…67 , π‘ˆ
Μ…13 and (𝑄3 − 𝑄1 )13 , three marginal PDFs, 𝑓(π‘ˆ
Μ…67 , π‘ˆ
Μ…13 ) , 𝑓(π‘ˆ
Μ…13 , (𝑄3 −
features, π‘ˆ
339
Μ…67 ) , can be computed in that 𝑓𝑋 (𝒙\π‘₯π‘˜ ) is obtained by
𝑄1 )13 ) and 𝑓((𝑄3 − 𝑄1 )13 , π‘ˆ
340
integrating 𝑓𝑋 (𝒙) with respect to the feature variable π‘₯π‘˜ . Figure 6 shows the marginal
341
PDFs computed from the 𝑓𝑋 (𝒙|𝑑1 ) and 𝑓𝑋 (𝒙|𝑑2 ). Based on the marginal PDFs, we can
342
study more clearly about the relationships between each pair of wind statistical
343
parameters. It can be seen that positive correlations exist between the features, but the
344
levels of correlation differ slightly between the day and the night times.
345
19
(a) Diurnal
(b) Nocturnal
Figure 6. Marginal PDFs showing the characteristics for the diurnal and nocturnal wind
fields. The contours represent PDFs constructed from GMM applied to the input data,
represented as dot.
20
346
Analogously, we can obtain an univariate PDF 𝑓𝑋𝑖 (π‘₯𝑖 ) for a specific feature variable by
347
integrating the 𝑓𝑋 (𝒙) with respect to all other feature variables. The probability mass
348
function for a feature variable can then be computed by integrating the univariate PDF
349
over a discrete interval. Using the discretized intervals of 14/30 m/s, 8/30 m/s and 4/30
350
m/s, which divide the range of input features into 30 bins, respectively, the discrete
351
Μ…67 ) , 𝑃(π‘ˆ
Μ…13 ) and 𝑃((𝑄3 − 𝑄1 )13 ) , are estimated and
probability mass functions, 𝑃(π‘ˆ
352
plotted as continuous line in Figure 7. That is, the univariate PDFs are generated from the
353
multivariate PDF constructed by applying GMMs to the wind feature training data set.
354
Comparing the discrete PMF computed by counting the frequency of occurrence directly
355
from the wind data (shown as bar in Figure 7), we can observe that the statistical trends
356
predicted from the GMM are in good agreement with those described by the data.
357
358
The univariate PDFs can also be derived individually by applying GMMs to each wind
359
feature. As shown in Figure 7, the PMF obtained from the multivariate PDF (3-D GMM)
360
and the PMF from the independent univariate PDF (1-D GMM) are in good agreement.
361
This result indicates that the high dimensional PDFs constructed appear to preserve the
362
statistical trend in the wind input features. Furthermore, the multivariate PDFs can
363
provide valuable information about the correlation among the wind field features.
364
21
(a) Diurnal
(b) Nocturnal
Figure 7. PMFs showing the characteristics for the diurnal and nocturnal wind fields.
365
366
WIND TURBINE LOAD CLASSIFICATION
367
368
Gaussian discriminant analysis (GDA), first proposed by Fisher (1936) as statistical
369
classification analysis, has been widely adopted as a supervised learning algorithm to
370
extract the underlying relationships between the input (feature vectors) and the output
371
(classes) data. In practice, if the input features for each output class follow the Gaussian
22
372
distribution, GDA requires the least number of training data to find the parameters for a
373
statistical classification model and has better classification accuracy compared to other
374
classification algorithms that map the input features directly to the output classes (Pohar
375
et al. 2004). In this study, GDA is used to study the relationships between the wind input
376
features and the tower acceleration response of the wind turbine structure.
377
378
Gaussian Discriminant Analysis
379
380
The Gaussian discriminant analysis constructs the response classification (prediction)
381
Μ…67 , π‘ˆ
Μ…13 , (𝑄3 − 𝑄1 )13 ), as defined
function 𝑔𝐢 (𝒙) that maps the wind input features 𝒙 = (π‘ˆ
382
previously, to the output response class 𝑦𝐢 ∈ {1, … , 𝑁𝐢 }, where 𝑁𝐢 is the number of
383
classes that categorize the different levels of output response feature. In this study, the
384
output response feature is the interquartile values (i.e. the difference between the 75
385
percentile and the 25 percentile) for a 20 minute acceleration time series recorded by an
386
accelerometer at a 21 m height inside the wind turbine tower. Analogous to the RMS
387
measure, the interquartile is used as measure to quantify the average magnitude of the
388
tower acceleration. Each interquartile data value is then assigned to an output response
389
class 𝑦𝐢 .
390
(𝑖)
391
Using π‘šπ‘  pairs of input and output (training) data, (𝒙(𝑖) , 𝑦𝐢 ), 𝑖 = 1, … , π‘šπ‘  , GDA
392
constructs the probability density function 𝑝(𝒙|𝑦𝐢 ) for the input feature vector 𝒙
393
conditional on the given output response class 𝑦𝐢 . Assuming multivariate Gaussian
23
394
distribution with mean vector 𝝁𝑗 and covariance matrix πšΊπ‘— , the PDF for the 𝑗th output
395
class 𝑝(𝒙|𝑦𝐢 = 𝑗), 𝑗 = 1, … , 𝑁𝐢 , can be represented as:
396
𝑝(𝒙|𝑦𝐢 = 𝑗) =
1
√(2πœ‹)𝑛 |πšΊπ‘— |
1
𝑇
exp⁑(− (𝒙 − 𝝁𝑗 ) πšΊπ‘—−1 (𝒙 − 𝝁𝑗 ))
2
(6)
397
398
where 𝑛 is the number of features in 𝒙, i.e. the dimension of 𝒙. Furthermore, the prior
399
distribution on the response class 𝑦𝐢 is expressed as a multinomial distribution, i.e.
400
𝑝(𝑦𝐢 = 𝑗; ⁑𝝋) = πœ‘π‘— , where πœ‘π‘— is simply the ratio between the number of data points for
401
class 𝑦𝐢 = 𝑗.
402
403
To obtain the mapping function 𝑔𝐢 (𝒙) between the input and output features, we
404
construct 𝑝(𝒙|𝑦𝐢 ) and 𝑝(𝑦𝐢 ) by estimating the parameters, 𝝁 = {𝝁1 , … , 𝝁𝑁𝐢 } , 𝚺 =
405
{𝚺1 , … , πšΊπ‘πΆ } and 𝝋 = {πœ‘1 , … , πœ‘π‘πΆ }, for the 𝑁𝐢 conditional distributions and the priors.
406
The parameters 𝝁, 𝚺 and 𝝋 can be determined by maximizing the log-likelihood function
407
over the π‘šπ‘  pairs of input and output (training) data sets (Li et al., 2006):
π‘šπ‘ 
(𝑖)
𝑙(𝝋, 𝝁, 𝚺) = ∑ ln⁑𝑝( 𝒙(𝑖) , 𝑦𝐢 ; 𝝋, 𝝁, 𝚺)
𝑖=1
(7)
π‘šπ‘ 
(𝑖)
(𝑖)
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= ∑ ln ⁑𝑝(𝒙(𝑖) |𝑦𝐢 ; 𝝁, 𝚺)𝑝(𝑦𝐢 ; 𝝋)
𝑖=1
408
24
(𝑖)
409
The response class 𝑦𝐢 denotes the specific PDF from which π‘₯ (𝑖) is drawn. As the class
410
𝑦𝐢 is explicitly given from the training data, Eq. (7) can be expressed in terms of the
411
previously defined 𝑝(𝒙|𝑦𝐢 ) and 𝑝(𝑦𝐢 ). Note that the log-likelihood function in Eq. 7 is
412
concave (because of the sum of logarithmic functions), the function can be easily
413
optimized with respect to the parameters 𝝁, 𝚺, and 𝝋.
(𝑖)
414
415
Once the parameters⁑𝝁, 𝚺, and 𝝋, for 𝑝(𝒙|𝑦𝐢 ) and 𝑝(𝑦𝐢 ) are determined, the posterior
416
distribution on the response class 𝑦𝐢 given the wind input feature 𝒙 is modeled according
417
to the Bayes’ rule as follows:
418
𝑝(𝑦𝐢 |𝒙) =
𝑝(𝒙|𝑦𝐢 )𝑝(𝑦𝐢 )
𝑝(𝒙|𝑦𝐢 )𝑝(𝑦𝐢 )
= ⁑
∑𝑦𝐢 𝑝(𝒙|𝑦𝐢 )𝑝(𝑦𝐢 )
𝑝(𝒙)
(8)
419
Therefore, for the new wind input feature vector 𝒙𝑛𝑒𝑀 representing the newly observed
420
wind field characteristics, the corresponding estimated response class 𝑦̂𝐢 ∈ {1, … , 𝑁𝐢 }
421
can be determined according to the maximum a posteriori detection (MAP) principle
422
expressed as:
423
𝑦̂𝐢 = 𝑔𝐢 (𝒙𝑛𝑒𝑀 ) ≝ argmax 𝑝(𝑦𝐢 |𝒙𝑛𝑒𝑀 )
𝑦𝐢
(9)
⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑⁑= argmax 𝑝(𝒙𝑛𝑒𝑀 |𝑦𝐢 )𝑝(𝑦𝐢 )
𝑦𝐢
424
425
The response classification function 𝑔𝐢 (𝒙) divides the domain of the input feature 𝒙 (in a
426
3D space) into 𝑁𝐢 regions, each of which corresponds to a different output response class.
25
427
We can construct the boundary between the two neighboring classes 𝑖 and 𝑗 by setting
428
𝑝(𝑦𝐢 = 𝑖|𝒙) = 𝑝(𝑦𝐢 = 𝑗|𝒙) in Eq. (6), which leads to a quadratic equation in terms of
429
the input feature 𝒙 as follows (Hastie et al. 2008):
𝒙𝑇 (πšΊπ‘–−1 − πšΊπ‘—−1 )𝒙 − 2(𝝁𝑻𝑖 πšΊπ‘–−1 − 𝝁𝑻𝑗 πšΊπ‘—−1 )𝒙 + 𝝁𝑻𝑖 πšΊπ‘–−1 𝝁𝑖 − 𝝁𝑻𝑗 πšΊπ‘—−1 𝝁𝑗 + ln
|πšΊπ‘– |
|πšΊπ‘— |
=0
(10)
430
431
Eq. (10) represents a quadratic surface that separates two neighboring regions (classes).
432
Note that if the covariance matrices are assumed to be the same for all the classes, i.e.
433
πšΊπ‘– = 𝚺⁑ for⁑𝑖 = 1, … , 𝑁𝐢 , Eq. (10) becomes a linear plane surface and the problem is
434
generally known as linear discriminant analysis.
435
436
Application to wind fields and wind turbine responses
437
438
The size of the training data set, π‘šπ‘  , and the number of output response classes, 𝑁𝐢 , can
439
affect the accuracies of the response classification function 𝑔𝐢 (𝒙) shown in Eq. (9).
440
Figure 8 shows the accuracy ratios of 𝑔𝐢 (𝒙) with different number of response classes
441
and different size of the training data set. The accuracy ratio is calculated as the ratio
442
between the number of correct predictions by 𝑔𝐢 (𝒙) and the size of the test data set, π‘šπ‘‘ .
443
Altogether, 1,000 pairs ( π‘šπ‘‘ = 1,000 ) of the input feature vector 𝒙 and the output
444
response class 𝑦𝐢⁑ , which do not belong to the training data set, are used to calculate the
445
accuracy ratio for the classification by 𝑔𝐢 (𝒙). Furthermore, to define different number of
446
classes, the interquartile values are evenly divided and categorized into 𝑁𝐢 discretized
447
classes based on their magnitudes, which range from 0 to 19 mg.
26
448
449
Form Figure 8, it can be seen that the accuracy ratio decreases as the number of the
450
response classes increases. On the other hand, for a given number of response classes, the
451
accuracy ratio rarely changes with the size of the training data set. As the size of training
452
data set exceeds the minimum number of data points necessary to estimate the covariance
453
matrix for each class, the accuracy ratio seems to remain the same, which indicates the
454
effectiveness of GDA in terms of the training data size. With the input features employed
455
in this study, we select 5 output response classes (𝑁𝐢 = 5) that show reasonably high
456
accuracy ratios (78 ~ 80 %) from the test data set. The ranges of the interquartile values
457
for each class are shown in Table 1.
Figure 8. Comparison of the accuracy ratios for the response classification functions
using different numbers of classes and different sizes of training data sets.
458
459
Table 1. The ranges of interquartile value for each output response class.
Class 𝑦𝐢
Interquartile
range
1
0.0 ~ 3.8
(mg)
2
3.8 ~ 7.6
(mg)
3
7.6 ~ 11.4
(mg)
460
27
4
11.4 ~ 15.2
(mg)
5
15.2 ~ 19.0
(mg)
461
For a total of 7,960 pairs input feature vectors and output response data collected over
462
110 days, we randomly select 3,980 pairs as the training data set. Figure 9 shows the data
463
points for the input feature vectors and the corresponding output response classes. The
464
boundary surfaces discriminating two neighboring classes as determined by Eq. (10) are
465
also plotted in Figure 9. It can be seen that the quadratic boundary surfaces discriminate
466
the response classes reasonably well.
467
468
469
470
471
472
473
474
Figure 9. Training data for GDA and the constructed boundaries classifying the response
classes.
EXPECTED WIND TURBINE LOADS AND CLASS PROBABILITIES
475
476
The diurnal and nocturnal wind field characteristics are investigated by constructing
477
multivariate probability density functions via GMMs, while the influences of the wind
478
fields on the wind turbine response are studied by constructing classification functions
28
479
with the aid of GDA. Combining these two independently constructed models, we can
480
establish insightful information by establishing a probabilistic framework to study the
481
response of the wind turbine during the day and the night times. Specifically, we compare
482
the class probabilities and the expected classes of the wind turbine tower responses
483
corresponding to the diurnal and the nocturnal winds.
484
485
Evaluation for the Class Probabilities and the Expected Class
486
487
Given the load classification function and the probability density function for wind
488
statistical parameters, the class probabilities and the expected response class can be
489
obtained, without requiring further measurements of the wind turbine responses from
490
sensors. The probability of the wind turbine response class 𝑖⁑conditional on the specific
491
time period 𝑑𝑗 can be estimated by integrating the time-specific probability density
492
function 𝑓𝑋 (𝒙|𝑑𝑗 ) over the input feature domain {𝒙|𝑔𝐢 (𝒙) = 𝑖} corresponding to the wind
493
turbine response class 𝑖 as:
⁑
𝑝(𝑦𝐢 = 𝑖|𝑑𝑗 ) ≈ 𝑝(𝑦̂𝐢 = 𝑖|𝑑𝑗 ) = ∫
𝑓𝑋 (𝒙|𝑑𝑗 )𝑑𝒙
{𝒙|𝑔𝐢 (𝒙) = 𝑖 }
(11)
494
495
As mentioned before, 𝑦𝐢 is the true wind turbine response class and 𝑦̂𝐢 is the estimated
496
response class by the load classification function 𝑔𝐢 (𝒙), i.e., 𝑦̂𝐢 = 𝑔𝐢 (𝒙). The response
497
class probability 𝑝(𝑦𝐢 |𝑑𝑗 ) can provide how the wind turbine response is distributed
498
during the time period 𝑑𝑗 . The expected wind turbine response class conditional on the
499
time period 𝑑𝑗 can be determined by integrating the response classification function
29
500
𝑔𝐢 (𝒙) weighted by the time specific probability density function 𝑓𝑋 (𝒙|𝑑𝑗 ) over the entire
501
input feature domain as:
⁑
E[𝑦𝐢 |𝑑𝑗 ] ≈ E[𝑦̂𝐢 |𝑑𝑗 ] = ∫ 𝑔𝐢 (𝒙)𝑓𝑋 (𝒙|𝑑𝑗 )𝑑𝒙
(12)
𝒙
502
503
The expected wind turbine response can provide general insight into the difference in the
504
wind turbine response for the time period considered.
505
506
Application to wind fields and wind turbine responses
507
508
To study the wind turbine responses corresponding to the diurnal and the nocturnal wind
509
fields, we construct the class probabilities 𝑃(𝑦𝐢 = 𝑖|𝑑𝑗 ) and the expected response classes
510
E[𝑦𝐢 |𝑑𝑗 ] for the wind turbine tower acceleration during the daytime 𝑑1 and during the
511
nighttime 𝑑2 . Figure 10 shows 𝑓𝑋 (𝒙|𝑑1 ) and 𝑓𝑋 (𝒙|𝑑2 ) overlapped with the load
512
classification function⁑𝑔𝐢 (𝒙), which divides the input feature domain into five different
513
classes. For display purposes, the PDFs are represented by the contour surfaces with
514
𝑓𝑋 (𝒙|𝑑𝑗 ) = 0.02. The probability of the class 𝑖 can be estimated by integrating the PDF
515
intersected with the input feature domain corresponding to the class 𝑖 using Eq. (11). As
516
shown in Figure 10, the PDF for the daytime wind disperses more widely in the region
517
associated with the high load classes (𝑦𝐢 = 3, 4⁑and⁑5) than the PDF for the nighttime
518
wind. As a result, the diurnal wind fields have higher probabilities for classes 3, 4 and 5
519
but lower class probabilities for classes 1 and 2 than the nocturnal wind fields as shown
520
in Figure 11.
30
(a) Day
(b) Night
Figure 10. PDF 𝑓𝑋 (𝒙|𝑑𝑗 ) and the response classification function ⁑𝑔𝐢 (𝒙) for evaluating
the class probabilities and the expected response classes.
521
Figure 11. Comparison of the estimated class probabilities for the day and the night
times.
522
523
Using the available data, the true class probabilities and the expected classes
524
corresponding to the diurnal and the nocturnal wind fields can be directly calculated by
525
counting the number of measurement data points in each response class for the day and
526
night time periods. Table 2 compares the class probabilities and the expected classes
527
obtained from the measurement data and those estimated from 𝑓𝑋 (𝒙|𝑑𝑗 ) and 𝑔𝐢 (𝒙). In
528
general, the estimated class probabilities compare very well with the true (measured)
31
529
class probabilities with only slight deviation observed for the nocturnal case. The
530
estimated expected class also matches very well with the expected classes obtained from
531
the measurement data. It is worth noting that while accurately predicting the statistical
532
trend in the wind turbine output response, the machine learning approach can also
533
elucidate the reason for the statistical trend in the wind turbine output response by
534
relating 𝑓𝑋 (𝒙|𝑑𝑗 ) and 𝑔𝐢 (𝒙). That is, the daytime wind fields induce the higher level of
535
the wind turbine tower acceleration due to the higher mean wind speed and turbulence
536
compared to the night time wind fields.
Table 2. Comparison of the class probabilities and the expected loads.
537
𝑑1
(day)
𝑑2
(night)
Measured
Estimated
Measured
Estimated
Class 1
Class 2
Class 3
Class 4
Class 5
0.3835
0.3832
0.5202
0.4916
0.2901
0.2913
0.2807
0.3319
0.1849
0.1845
0.1468
0.1160
0.0799
0.0766
0.0279
0.0347
0.0616
0.0645
0.0244
0.0257
Expected
class
2.1461
2.1480
1.7557
1.7710
538
539
540
CONCLUSION
541
542
A machine learning approach has been presented to analyze time-specific wind field
543
characteristics and to predict the response of a wind turbine at different time periods. The
544
machine learning algorithms have been tested using the long-term monitoring data,
545
collected by an integrated life-cycle management system installed at a wind turbine in
546
Germany.
547
32
548
A Gaussian mixture model (GMM) approach is applied to the measured wind data to
549
construct the probability density function for wind statistical parameters in an attempt to
550
understand the characteristics of wind fields. We study the characteristics of the daytime
551
and nighttime wind fields by comparing the PDFs constructed based on the diurnal and
552
nocturnal wind data. The PDFs clearly show that the wind fields in the daytime are
553
characterized by faster mean wind speeds and higher levels of fluctuations than the
554
nighttime counterparts; this trend is effectively characterized by a multivariate probability
555
density function for wind field statistical parameters.
556
557
A Gaussian discriminative analysis (GDA) has been applied to find the relationship
558
between the wind input features and the response of the monitored wind turbine tower. In
559
this study, three input features include the mean wind speeds at 13 m and 67 m height,
560
and the interquartile values of the 20-minute wind speed time series measured at 13 m
561
height. The output response is the interquartile values of 20-minute wind turbine tower
562
acceleration time series. In general, wind turbine long-term monitoring data are collected
563
at limited locations and only statistical parameters (such as means, quartiles and standard
564
deviations) of time series data are archived. It may be worth noting that although the
565
interquartile values employed in this study is a reasonable measure analogous to RMS
566
values of the acceleration time series, the standard deviation or RMS values could be
567
more ideal features for quantifying the averaged magnitude of the acceleration or other
568
types of wind turbine response. In the future, the wind turbine monitoring system will be
569
updated to calculate and archive the standard deviations of wind speed and wind turbine
570
response time series.
33
571
572
By combining the multivariate probability density function and the response
573
classification function, useful information that are helpful to the operation of the wind
574
turbine can be obtained. This study examines the class probabilities and the expected
575
classes for the levels of wind turbine tower acceleration at daytime and nighttime and
576
used them to elucidate the reasons for different wind turbine tower vibrational response.
577
It becomes transparent that the expected response class for the daytime is higher than that
578
for the nighttime because daytime wind fields have higher mean wind speeds and higher
579
levels of turbulence, both of which are positively correlated to level of wind turbine
580
tower vibration.
581
582
The statistical machine learning approach presented in this paper can also be used to
583
study the influence of the daily, monthly, seasonal and yearly variations of wind field
584
characteristics on wind turbine response. If we establish the accurate wind turbine
585
response classification function for a specific wind turbine model and construct a site
586
specific probability density function for the wind input statistical features, the approach
587
can potentially be expanded to evaluate how the wind turbine will respond at that site. In
588
future research, we plan to develop an economical and practical methodology for
589
structure monitoring that uses wind input statistical features commonly collected at each
590
wind turbine but employs structural responses measured from only one or a few
591
representative wind turbines equipped with sensors.
592
593
34
594
ACKNOWLEDGEMENTS
595
596
This research is partially funded by the German Research Foundation (DFG) under grants
597
SM 281/1-1, SM 281/2-1, and HA 1463/20-1. Any opinions, findings, conclusions, or
598
recommendations are those of the authors and do not necessarily reflect the views of the
599
DFG.
600
601
REFERENCES
602
603
Akaike, H. (1974). “A new look at the statistical model identification,” IEEE Transaction
604
on Automatic Control, 19(6), pp.716–723.
605
606
Adobe Systems Incorporated. (2007). “Datasheet Adobe Flex Builder 3 – Create
607
engaging, cross-platform rich Internet applications,” San Jose, CA, USA: Adobe
608
Systems Incorporated.
609
610
Boyd, S. and Vandenberghe, L. (2004). “Convex Optimization,” Cambridge University
611
Press.
612
613
Castors, M. (2008). “PDI Performance tuning check-list,” Technical Report. Orlando, FL,
614
USA: Pentaho Corporation.
615
35
616
Fisher, R (1936). “The use of multiple measurements in taxonomic problems,” Annal
617
Eugen (7), pp.179–188.
618
619
Gonzalez-Longatt, F. M., Rueda, J. L. and Erlich, I., Bogdanov, D. and Villa, W. (2012).
620
“Identification of Gaussian Mixture Model using Mean Variance Mapping Optimization:
621
Venezuelan Case,” In: Proceedings of IEEE PES Innovative Smart Grid Technologies
622
Europe, Berlin, Germany, October 14, 2012.
623
624
Hastie, T, Tibshirani, R. and Friedman, J. (2008). “The elements of statistical learning,”
625
Springer.
626
627
Hartmann, D., Smarsly, K. and Law, K. H. (2011). “Coupling sensor-based structural
628
health monitoring with finite element model updating for probabilistic lifetime estimation
629
of wind energy converter Structures,” In: Proceedings of the 8th International Workshop
630
on Structural Health Monitoring 2011. Stanford, CA, USA, September 13, 2011.
631
632
International Electotechnical Commission. (2005). “Wind Turbines - Part 1: Design
633
Requirements,” IEC 61400-1, Edition 3.0.
634
635
Lachmann, S., Baitsch, M., Hartmann, D. and Höffer, R. (2009). “Structural lifetime
636
prediction for wind energy converters based on health monitoring and system
637
identification,” In: Proceedings of the 5th European & African Conference on Wind
638
Engineering. Florence, Italy, July 19, 2009.
36
639
640
Li, T., Zhu, S. and Ogihara, M. (2006). “Using discriminant analysis for multi-class
641
classification: an experimental investigation,” Knowledge and Information Systems, 10(4),
642
pp. 453-472.
643
644
Polanco, J. and Pedersen, A. (2009). “Understanding the Flex 3 Component and
645
Framework Lifecycle,” San Francisco, CA, USA: Development Arc LLC.
646
647
Pohar, M., Blas, M. and Turk, S. (2004). “Comparison of logistic regression and linear
648
discriminant analysis: a simulation study,” Metodološki zvezki, 1(1), pp.143-161.
649
650
Render, A. R. and Walker, F. H. (1984). “Mixture densities, maximum likelihood and the
651
eM algorithm,” SIAM Review, 26(2), pp. 195-239.
652
653
Reynolds, A. D., Quatieri, F. T. and Dunn, B. R. (2000). “Speaker verification using
654
adapted Gaussian Mixture Models,” Digital Signal Processing, 10, pp. 19-41.
655
656
Roldan, M.C. (2009). “Pentaho Data Integration (Kettle) Tutorial,” Technical Report.
657
Orlando, FL, USA: Pentaho Corporation.
658
659
Singh, R., Pal, C. B. and Jabr, A. R. (2010). “Statistical representation of distribution
660
system loads using Gaussian mixture model,” IEEE Transactions on Power Systems, 25
661
(1), pp. 29-37.
37
662
663
Smarsly, K. and Hartmann, D. (2009a). “Real-time monitoring of wind turbines based on
664
software agents,” In: Proceedings of the 18th International Conference on the
665
Applications of Computer Science and Mathematics in Architecture and Civil
666
Engineering. Weimar, Germany, July 7, 2009.
667
668
Smarsly, K. and Hartmann, D. (2009b). “Multi-scale monitoring of wind energy plants
669
based on agent technology,” In: Proceedings of the 7th International Workshop on
670
Structural Health Monitoring. Stanford, CA, USA, September 9, 2009.
671
672
Smarsly, K. and Hartmann, D. (2010). “Agent-oriented development of hybrid wind
673
turbine monitoring systems,” In: Proceedings of the 2010 EG-ICE Workshop on
674
Intelligent Computing in Engineering. Nottingham, UK, June 30, 2010.
675
676
Smarsly, K., Law, K. H. and Hartmann, D. (2012a). “A multiagent-based collaborative
677
framework for a self-managing Structural health monitoring system,” ASCE Journal of
678
Computing in Civil Engineering, 26(1), pp. 76-89.
679
680
Smarsly, K., Law, K. H. and Hartmann, D. (2012b). “Towards life-cycle management of
681
wind turbines based on structural health monitoring,” In: Proceedings of the First
682
International Conference on Performance-Based Life-Cycle Structural Engineering.
683
Hong Kong, China, December 5, 2012.
684
38
685
Smarsly, K., Hartmann, D. and Law, K. H. (2012c). “Integration of Structural Health and
686
Condition Monitoring into the Life-Cycle Management of Wind Turbines,” In:
687
Proceedings of the Civil Structural Health Monitoring Workshop. Berlin, Germany,
688
November 6, 2012.
689
690
Smarsly, K., Hartmann, D. & Law, K. H. (2013a). “A computational framework for life-
691
cycle management of wind turbines incorporating structural health monitoring,”
692
Structural Health Monitoring – An International Journal, 12(4), pp. 359-376.
693
694
Smarsly, K., Hartmann, D. & Law, K. H. (2013b). “An integrated monitoring system for
695
life-cycle management of wind Turbines,” International Journal of Smart Structures and
696
Systems, 12(2), pp. 209-233.
697
698
Stull, R. B. (1998). “An introduction to boundary layer meteorology,” Kluwer Academic
699
Publishers.
700
701
Stauffer, C. and Grimson, W.E.L. (1999). “Adaptive background mixture models for real-
702
time tracking,” In: Proceeding of IEEE Computer Society Conference on Computer
703
Vision and Pattern Recognition, Fort Collins, U.S.A., 23 Jun, 1999
704
705
706
39
Download