Forecasting Temperature and Solar Radiation: An Application of

advertisement
Forecasting Temperature and Solar Radiation: An Application of
Neural Networks and Neuro-Fuzzy Techniques
J. Zisos (1) , A. Dounis (2) , G. Nikolaou (2) ,G. Stavrakakis (1) , D.I. Tseles(2)
(1) Technical University of Crete, Chania, Greece
(2) T.E.I. of Piraeus, Athens, Greece
_____________________________________________________________________
Abstract
Neural and neuro-fuzzy systems are used, in order to forecast temperature and solar
radiation. The main advantage of these systems is that they don’t require any prior
knowledge of the characteristics of the input time-series in order to predict their future
values. These systems with different architectures have been trained using as inputdata measurements of the above meteorological parameters obtained from the
National Observatory of Athens. After having simulated many different structures of
neural networks and trained using measurements as training data, the best structures
are selected in order to evaluate their performance in relation with the performance of
a neuro-fuzzy system. As the alternative system, ANFIS neurofuzzy system is
considered, because it combines fuzzy logic and neural network techniques that are
used in order to gain more efficiency. ANFIS is also trained with the same data. The
comparison and the evaluation of both of the systems are done according to their
predictions, using several error metrics.
Keywords: Forecast, Neural Network, Neuro-Fuzzy system, ANFIS, error metrics
1. Introduction
There have been many different scientific efforts in order to realize better results
in the domain of forecasting meteorological parameters. Temperature and solar
radiation forecasting constitutes a very crucial issue for different scientific areas as
well as for many different aspects of everyday life.
In the present paper, many different structures of multilayer feedforward neural
networks have been developed and simulated, using as training and test data twenty
year measurements, obtained from the National Observatory of Athens. The aim is,
after a large number of simulations, using different parameters, to result at the best
possible structure of neural network forecaster of daily temperature and solar
radiation. Additionally, in order to observe the results of the combination of the
linguistic terms of fuzzy logic with the training algorithm of neural networks, in the
domain of forecasting, the neuro-fuzzy system ANFIS has been used. It has been
trained and tested with the same data as the neural networks, and its forecasting
results have been compared with these of the best neural network structures.
It has to be mentioned that the developed structures have been used for short-term
prediction.
The paper is structured as following: initially, there are presented the main aspects
of timeseries analysis and more specifically timeseries prediction as well as its
advantages. Next neural networks are described briefly. Besides there are reported
several structural training issues of neural network predictors. Afterwards neuro-fuzzy
systems and more specifically ANFIS are described briefly. Lastly the are presented
and compared the main results-predictions of temperature and solar radiation, after
simulating the above systems , using several error metrics.
Theoritical Part
2. Timeseries analysis - prediction
Timeseries is a stochastic process where the time index takes on a finite or
countably infinite set of values. These values usually constitute measurements of a
physical system obtained on specific time delays which might be hours,days,months
or years.
Timeseries analysis includes three basic problems:
 Prediction
 Modeling
 Characterization.
The one that concerns this paper is the prediction problem. Timeseries prediction
is defined as a method depicting past timeseries values to future ones. The method is
based on the hypothesis that the change of the values follows a specific model which
can be called latent model.
The timeseries model can be considered as a black box which makes no effort to
retrieve the coefficients which affect its behaviour.
Figure 1. Timeseries model block diagram
Input of the model are the past values X i until the time instance x=t and output Y is
the prediction of the model at the time instance x=t+p. The system could be a neural
network, a neurofuzzy system e.t.c
The main advantages of the timeseries prediction model are:
 In a lot of cases we need to identify what happens, not why it happens.
 The cost of the model is least in comparison with other categories of models
like the explanatory model.
Timeseries normalization
It constitutes one of the most frequently used preprocessing methods. Normalizing
data results in smoothing timeseries, as the values are limited in a specific range. It is
also very suitable for the neural and neuro-fuzzy systems which use activation
functions.
3. Neural networks
A neural network is a massively parallel distributed processor made up of simple
processing units, which has a natural propensity for storing experiential knowledge
and making it available for use.
Neural networks are characterized by several parameters. These are, the activation
functions used in every layer nodes, network different architectures, the learning
processes that are used.
The most commonly used activation functions are linear, sigmoid and hyperbolic
tangent activation functions.
Network architectures are depended to the learning algorithms used. Network
architectures can be separated in three catecories: Single-layer feedforward Networks,
Multilayer Feedforward Networks and Recurrent Networks.
Learning processes are separated in two main categories, supervised and
unsupervised learning. Their main difference is that supervised learning algorithms
use input-output patterns in order to be trained in contrast to unsupervised learning
algorithms whose response is based on the networks ability to self-organize.
Multilayer networks have been applied successfully to solve diverse problems by
training them in supervised manner with a highly popular algorithm known as error
back-propagation algorithm. Two kinds of signals are identified in this network:
function signals and error signals. A function signal propagates forward through the
network, and emerges at the output end of the network as an output signal. An error
signal originates at an output neuron of the network, and propagates backward
through the network. It is called an error signal because its computation by every
neuron of the network involves an error-dependent function. Target of the process is
to train network weights with input-output patterns, by reducing the error signal of the
output node, for every training epoch.
The error signal at the output of neuron j at iteration n is defined by:
e j (n)=t j (n)-y j (n).
The value of the error energy over all neurons in the output layer is defined by:
E(n)=
1
e 2j ( n )

2 jC
The average squared error energy or cost function is defined by:
Ε av =
1 N
 E(n )
N n1
Training process object is to adapt the parameters in order to minimize the cost
function. A very efficient minimization method is Levenberg-Marquardt. It is very
quick and has quite small computational complexity.
4. ANFIS (Adaptive Neuro-Fuzzy Inference System)
Neuro-fuzzy systems constitute an intelligent systems hybrid technique that
combines fuzzy logic with neural networks in order to have better results.
ANFIS can be described as a fuzzy system equipped with a training algorithm. It
is quite quick and has very good training results that can be compared to the best
neural networks.
Experimental Part
5. Solar Radiation and Temperature Timeseries
It has to be mentioned that the meteorological data used, come from the Greek
National Observatory, of Athens.
Solar Radiation
The data obtained were hourly measurements for the period 1981-2000 and were
measured in Watt 2 . After observing the data it’s obvious that apart from the logical
m
values, there are some data that are equal to -99.9. They constitute measurement
errors and have to be replaced. The object is to create the mean daily solar radiation
timeseries without error measurements in order to use it to the prediction systems. The
process used in order to create the timeseries described is the following:
1. Replacement -99.9 values with zeros
2. Computation of the average of non-zero hourly values for every 24 hours for
all the years
3. The mean daily values of the timeseries that are equal to zero are replaced
from the average of the previous and next values if they are non-zeros or else
from first previous non-zero value.
Temperature
The temperature data obtained, are for the period 1981-2003 and are measured in
C o . They also contain error measurements equal to -99.9 that have to be eliminated.
The object is to create the mean, maximum and minimum daily temperature
timeseries without the error measurements. The process used is the same as before
differing in the fact that in the temperature timeseries we are concerned to zero and
negative values in contrast with the solar timeseries that we care for the positive nonzero values of solar radiation.
In the following curves appear the daily timeseries created:
solar radiation
temperature
700
40
solar radiation daily measurements
temperature mean daily measurements
35
600
30
500
25
400
20
300
15
200
10
5
100
0
0
0
1000
2000
3000
4000
5000
6000
7000
8000
-5
0
1000
Mean daily solar radiation
2000
3000
4000
5000
6000
7000
8000
9000
Mean daily temperature
temperature
temperature
35
45
temperature minimum daily measurements
temperature maximum daily measurements
40
30
35
25
30
20
25
15
20
10
15
5
10
5
0
0
-5
-5
0
1000
2000
3000
4000
5000
6000
7000
Max daily temperature
8000
0
1000
2000
3000
4000
5000
6000
7000
9000
Min daily temperature
8000
9000
Normalization
In everyone of the above timeseries there are used the next two normalizing
transformations:
X  mean
1. Yk  k
, normalization with mean=0 , standard deviation=1
std
0 .8
0.1 * max  0.9 * min
* Xk +
2. Yk 
, normalization with maximum
max  min
max  min
value equal to 0.9 and minimum equal to 0.1.
6. Neural Network Predictors
The main object is to create many different structures of neural network predictors,
train and test them with the meteorological data available, in order to conclude to the best
and more efficient topology for forecasting solar radiation and temperature.
Initially it is appropriate to split the data to training and test data. There is not any
fixed theoretical model clarifying what percentage of the whole data should be the
training or the test data. Usually test data constitute a 20 to 30 percent of the overall data.
So, concerning the solar radiation data, they were split in four parts its one consisting of 5
years and the temperature data were split in three parts of 8 years of data each. The
question is which part of the data could be used as test data in order to constitute a
characteristic sample of the measurements. That’s because there may be a part of data that
during the training process can cause local minima, which essentially is a destruction of
the predictions. In order to solve this problem, multifold cross-validation has been
utilized. After implementing this method it was obvious that any part of data can be used
as test data as the test error was almost the same for every part. So, it was decided that for
the solar radiation the training data are the data of the years 1981 to 1995 and the test data
are the data of the years 1996 to 2000, and for the temperature the training data are the
data of the years 1981 to 1995 and the test data are the 1996 to 2003 data.
Next step is to decide the main structure of the neural network predictors. The main
question to this is the number of hidden layers used firstly, and the number of nodes
secondly, in order to avoid huge complexity and the overfitting problem. For the
meteorological timeseries used in this project, one hidden layer is appropriate and
enough. That’s because the first hidden layer is used in order to find the local
characteristics of the variable examined and that’s what the project is occupied with. So
there have been created and simulated the following cases of structures of neural
networks:
Inputs: 2,3,5,7 previous daily measurements for the 12 different timeseries created (real
and normalized data)
Number of hidden layers: 1(local characteristics)
Number of nodes of hidden layer: 2,3,5,10,15 (trial and error)
Output: 1 (One day prediction)
Activation Functions used in neurons: Hidden layer: sigmoid, linear
Output layer: linear
What is following is the training method. Training methods object is to minimize the
mean square error between the predictions and the real values. In order to find the most
appropriate training algorithm, there was created a small neural network 3-10-1,with
sigmoid activation function in the nodes of the hidden layer and it was simulated for
normalized data in the region 0.1-0.9 for 13 different algorithms that appear in the neural
toolbox of Matlab. Five error metrics were used in order to choose the most efficient
algorithm: MSE, RMSE, AME, NDEI, ρ. The algorithms used are the following:
1. Quasi Newton Backpropagation(trainbfg)
2. Bayesian regularization backpropagation(trainbr)
3. Conjugate gradient backpropagation with Powell-Beale restarts(traincgb)
4. Conjugate gradient backpropagation with Fletcher-Reeves updates(traincgf)
5. Conjugate gradient backpropagation with Polak-Ribiere updates(traincgp)
6. Gradient descent backpropagation(traingd)
7. Gradient descent with adaptive learning rate backpropagation(traingda)
8. Gradient descent with momentum backpropagation(traingdm)
9. Gradient descent with momentum and adaptive learning rate backpropagation
10. Levenberg-Marquardt backpropagation(trainlm)
11. One step secant backpropagation(trainoss)
12. Resilient backpropagation(trainrp)
13. Scaled conjugate gradient backpropagation(trainscg)
(traingdx)
0.14
0.16
0.025
X= 6
Y= 0.024803
X= 6
Y= 0.15749
X= 8
Y= 0.021149
X= 8
Y= 0.14543
0.14
0.12
X= 6
Y= 0.12143
0.02
X= 8
Y= 0.11173
0.12
0.1
X= 7
Y= 0.11516
0.1
X= 1
Y= 0.10636
X= 2
Y= 0.10589
X= 4
Y= 0.107
X= 3
Y= 0.10625
X= 9
Y= 0.11261
X= 5
Y= 0.10694
X= 10
Y= 0.10581
X= 11
Y= 0.10702
X= 12
Y= 0.10779
X= 13
Y= 0.10651
0.08
X= 9
Y= 0.01268
X= 12
X= 11
X= 10
Y= 0.011454 Y= 0.011618
Y= 0.011196
X= 5
Y= 0.011436
X= 9
Y= 0.083727
X= 11
X= 13
X= 12
X= 10
Y= 0.077858 Y= 0.078352 Y= 0.078275 Y= 0.078438
0.06
X= 13
Y= 0.011345
X= 2
Y= 0.011213
0.01
0.08
X= 7
Y= 0.083261
X= 5
X= 4
X= 1
X= 2
X= 3
Y= 0.077941 Y= 0.077805 Y= 0.077763 Y= 0.078125 Y= 0.078672
A.M.E.
M.S.E.
X= 7
Y= 0.013262
X= 4
X= 3
Y= 0.011289 Y= 0.01145
X= 1
Y= 0.011312
R.M.S.E.
0.015
0.06
0.04
0.04
0.02
0.005
0.02
0
0
0
1
2
3
4
5
6
7
8
Training Algorithms
9
10
11
12
1
2
3
4
5
6
13
Mean Square Error
7
8
Training Algorithms
9
10
11
12
1
2
13
Root Mean Square Error
3
4
5
6
7
8
Training Algorithms
9
10
0.8
0.3
X= 6
Y= 0.28548
X= 1
Y= 0.80972
X= 2
Y= 0.81151
X= 3
Y= 0.81023
X= 4
Y= 0.80722
X= 5
Y= 0.80731
X= 9
Y= 0.78415
X= 7
Y= 0.78334
X= 10
Y= 0.81192
X= 11
X= 12X= 13
Y= 0.80693 Y=
Y=0.80899
0.80388
X= 13
Y= 0.80899
0.7
X= 6
Y= 0.63006
X= 8
Y= 0.26362
0.25
X= 8
Y= 0.68168
0.6
X= 7
Y= 0.20876
N.D.E.I.
X= 3
Y= 0.19261
X= 4
Y= 0.19397
X= 9
Y= 0.20413
X= 5
Y= 0.19385
X= 2
Y= 0.19195
X= 11
Y= 0.19401
X= 12
Y= 0.19539
X= 13
Y= 0.19308
0.5
X= 10
Y= 0.19181
p
X= 1
Y= 0.1928
0.2
0.4
0.15
0.3
0.1
0.2
0.05
0.1
0
0
1
2
3
4
5
6
7
Training Algorithms
8
9
10
11
12
13
Normalized Root Mean Square
Error Index
1
2
3
4
5
6
7
8
Training Algorithms
9
10
11
12
13
Correlation Coefficient
From the above curves it’s obvious that the most efficient algorithm is the LevenbergMarquardt Backpropagation algorithm. Even in comparison with algorithms whose
metrics have close values with Levenbergs, Levenberg is better as it’s converging more
quickly.
Error Metrics
MSE 
 ( x(k )  xˆ (k ))
n
1
n
2
RMSE 
k 1
 ( x(k )  xˆ(k ))
n
1
n
AME 
 ( x(k )  xˆ(k ))
k 1
 x (k )
n
2
k 1
 x(k )  xˆ(k ))
n
1
n
k 1
k 1
 ( x(k )  x )  ( xˆ(k )  xˆ )
n
RMSE
NDEI 


2
n
2

k 1
 ( x(k )  x )   ( xˆ(k )  xˆ )
n
n
2
k 1
12
13
Absolute Mean Error
0.9
0.35
11
2
k 1

Where x(k) is the real value in time instance k, x (k) the prediction of the model, n the
number of test data used for the prediction.
It has to be mentioned that the most characteristic error criterion showing the quality
of prediction was proved be the correlation coefficient criterion (ρ). As the prediction
improves, ρ is getting close to 1.
Neural Networks training and prediction results
Test Measurement-Prediction with Neural Network
Mse=11.1818 Rmse=3.3439 Ame=2.882 p=0.97689 Ndei=0.16569
Test Measurement-Prediction with Neural Network
Mse=7394.6479 Rmse=85.9921 Ame=64.4325 p=0.81751 Ndei=0.22886
Measurements
Predictions
550
500
450
400
350
Temperature Daily Measurements - Predictions
Mean Solar Radiation Daily Measurements - Predictions
After having trained and tested all the different cases and structures of neural
networks with the different in normalization and type meteorological time-series, they
have been compared according to the above error criteria in order to result in the most
suitable neural predictors structure for every different time-series.
 In the beginning there were chosen the best four neural networks for every
different type of normalization
 Next there were chosen the best four neural network predictors for every different
type of meteorological time-series in order to use them to more complex systems
like neural network committee machines.
 Finally there was made choice of the best neural network predictor for every
different time-series in order to be able to compare its results with ANFIS or any
other system created.
The best neural network predictors for every different time-series are introduced:
1. Mean daily Solar radiation: 5-15-1, normalized in the range 0.1-0.9, using
sigmoid activation function
2. Mean daily Temperature: 5-10-1, normalized in the range 0.1-0.9, using sigmoid
activation function
3. Maximum daily Temperature: 7-5-1, normalized in the range 0.1-0.9, using
sigmoid activation function
4. Minimum daily Temperature: 2-15-1, normalized in the range 0.1-0.9, using
sigmoid activation function
18
14
12
10
8
6
4
1070
750
800
850
900
Number of samples
950
Mean daily solar radiation predictions vs measurements
with 5-15-1 N.N. for a small sample of data
Measurements
Predictions
18
16
14
12
10
8
1090
1100
Number of samples
1110
1120
1130
Test Measurement-Prediction with Neural Network
Mse=27.7587 Rmse=5.2687 Ame=4.6675 p=0.94254 Ndei=0.31335
Min Temperature Daily Measurements - Predictions
20
1080
Mean daily temperature predictions vs measurements
with 5-10-1 N.N. for a small sample of data
Test Measurement-Prediction with Neural Network
Mse=6.4888 Rmse=2.5473 Ame=1.9196 p=0.96458 Ndei=0.10388
Max Temperature Daily Measurements - Predictions
Measurements
Predictions
16
15
Measurements
Predictions
10
5
0
1050 1055 1060 1065 1070 1075 1080 1085 1090 1095 1100
Number of samples
Maximum daily temperature predictions vs
measurements with 7-5-1 N.N. for small a sample of
data
1050 1055 1060 1065 1070 1075 1080 1085 1090 1095
Number of samples
Minimum daily temperature predictions vs
measurements with 2-15-1 N.N. for a small sample of
data
7. ANFIS
600
Measurements
Predictions
550
500
450
400
350
300
250
200
800
820
840
860 880 900 920
Number of samples
940
960
980
Max Temperature Daily Measurements - Anfis Predictions
Mean daily solar radiation predictions vs measurements
with ANFIS for a small sample of data
Test Measurement-Prediction with Anfis
Mse=6.4536 Rmse=2.5404 Ame=1.9207 p=0.96413 Ndei=0.10359
20
Measurements
Predictions
18
16
14
12
10
8
6
1060
1065
1070
1075
1080
1085
Number of samples
1090
1095
1100
Test Measurement-Prediction with ANFIS
Mse=9.183 Rmse=3.0304 Ame=2.6213 p=0.97699 Ndei=0.15011
Measurements
Predictions
14
12
10
8
6
4
1080 1085 1090 1095 1100 1105 1110 1115 1120 1125
Number of samples
Mean daily temperature predictions vs measurements
with ANFIS for a small sample of data
Min Daily Temperature Measurements - ANFIS Predictions
Solar Daily Measurements - ANFIS Predictions
Test Measurement-Prediction with ANFIS
Mse=7543.572 Rmse=86.8537 Ame=64.3385 p=0.81305 Ndei=0.22346
Mean Daily Temperature Measurements - ANFIS Predictions
For the training and testing of data there was created a first Sugeno type system
consisting of 7 inputs. The combination of fuzzy logic with neural networks proved to
have very good results in the daily solar radiation and temperature forecasting.
Test Measurement-Prediction with ANFIS
Mse=25.5036 Rmse=5.0501 Ame=4.4378 p=0.93309 Ndei=0.30015
18
Measurements
Predictions
16
14
12
10
8
6
4
2
0
-2
1040
Maximum daily temperature predictions vs measurements
with ANFIS for small a sample of data
1050
1060
1070
Number of samples
1080
1090
1100
Minimum daily temperature predictions vs
measurements with ANFIS for a small sample of data
8. ‘Best’ Neural Network predictors vs ANFIS
Below there is presented a comparison between the ‘best’ N.N. predictors and
ANFIS, for the four different timeseries, using as criterion the metric that proved to be
the most accurate, which is the correlation coefficient (ρ).
Mean Daily Solar radiation
Ν.N. 5-15-1 ρ=0.81751
ΑΝFIS
ρ=0.81305
Mean Daily Temperature
Ν.N. 5-10-1 ρ=0.97689
ΑΝFIS
ρ=0.97699
Maximum Daily Temperature
Ν.N. 7-5-1 ρ=0.96458
ΑΝFIS
ρ=0.96413
Minimum Daily Temperature
Ν.N. 2-15-1 ρ=0.94254
ΑΝFIS
ρ=0.93309
9.Conclussions
Concerning neural networks that were created and simulated there have been
excluded the below conclusions:
 The most efficient training algorithm proved to be the Levenberg-Marquardt
Backpropagation, as it surpassed the others not only in quality of results but
also in speed.
 The best type of normalization proved to be the one in region 0.1-0.9 in
combination with sigmoid activation function in the hidden layer nodes.
 Most suitable choice of inputs is 5 or 7 previous days data.
 The number of nodes used in the hidden layer depends from the type and the
complexity of the timeseries, and from the number of inputs.
Concerning neuro-fuzzy predictors and more specifically ANFIS it is proved
that the combination of linguistic rules of fuzzy logic with the training algorithm
used in neural networks, contribute in very qualitative prediction results, which
approach the ‘best’ neural predictors results.
Download