SIM UNIVERSITY
SCHOOL OF SCIENCE AND TECHNOLOGY

DESIGN A SYSTEM TO RECOMMEND ENVIRONMENTAL WEATHER CONDITIONS FOR A NEW TOWN

STUDENT: CHER SOON PENG (PI NO. H0605379)
SUPERVISOR: MR NAVA SELVARATNAM
PROJECT CODE: JAN09/BEHE/02

A project report submitted to SIM University in partial fulfilment of the requirements for the degree of Bachelor of Engineering

NOV 2009

TABLE OF CONTENTS

Abstract
Statement of Assistance
List of Abbreviations
List of Figures
Chapter 1. Introduction
Chapter 2. Literature Review
  2.1 Why Does 4D VAR Beat 3D VAR?
  2.2 Euler-Lagrange Equation
  2.3 The Operational Mesogamma-Scale Analysis and Forecast System of the U.S. Army Test and Evaluation Command
  2.4 Dew Point Temperature Prediction
  2.5 Overview of Artificial Neural Networks
Chapter 3. Objectives
  3.1 Objectives of Project
  3.2 Main Technique Used
Chapter 4. Project Management
  4.1 Gantt Chart
Chapter 5. Design of Project
  5.1 Introduction to Climate Models
  5.2 Description of the Type of Climate Model Adopted
  5.3 Type of Software Used
  5.4 Description of the Program
Chapter 6. Experimentation of Program
  6.1 Type of Test Conducted
Chapter 7. Results
  7.1 Model 1
  7.2 Model 2
Chapter 8. Conclusion
  8.1 Achievements
  8.2 Challenges Faced
Chapter 9. Recommendations for Further Improvement
Chapter 10. Critical Review and Reflections
References
Appendix 1. Hourly Average Air Temperature for Wilma, Florida from 28 September to 7 October
Appendix 2. MATLAB Program

Abstract

Human activities are closely affected by the environmental weather conditions around them. Whether in a war zone or a sports arena, weather is an important factor in determining the degree of success, and in extreme cases it can pose a health hazard. Extreme weather such as droughts, typhoons and floods destroys property and costs human lives.
Some of these weather conditions can be predicted, and the necessary preventive action can then be taken to mitigate the losses to the economy and, most importantly, to human lives. In this project, a Neural Network is created using MATLAB software to predict the average hourly air temperature of Wilma, Florida.

Statement of Assistance

Special thanks to my project supervisor, Mr Nava Selvaratnam, who provided me with clear and concise explanations whenever I was in doubt and made this capstone project a successful one. Thanks also to the Western Regional Climate Centre, which provided the data studied and used in this capstone project.

List of Abbreviations

4DWX    Four Dimensional Weather
MRF     Medium-Range Forecast
3D VAR  3 Dimensional Variation scheme
FGAT    First Guess at Appropriate Time
MEP     Maximum Entropy Production
ANN     Artificial Neural Network
EBP     Error Back Propagation
MATLAB  Matrix Laboratory

List of Figures

Figure 1. Network architecture of Neural Network with 30 hidden layers
Figure 2. Network architecture of Neural Network with 60 hidden layers
Figure 3. Predicted Average Air Temperature from 28 September to 7 October
Figure 4. Observed Average Air Temperature from 28 September to 7 October
Figure 5. Predicted / Observed Average Air Temperature from 28 September to 7 October for Model 1
Figure 6. Error margin between Predicted and Observed average air temperature for Model 1
Figure 7. Predicted / Observed Average Air Temperature from 28 September to 7 October for Model 2
Figure 8. Error margin between Predicted and Observed average air temperature for Model 2

Chapter 1. Introduction

Weather plays an important role in daily human life, and meteorologists have existed since around 350 BC. Since then, instruments have been invented to collect information that enables meteorologists to forecast the weather more accurately.
Instruments have progressed from the primitive rain gauge, anemometer and barometer to remote-sensing equipment such as radar, lidar and satellites, which collect data with higher accuracy. The future of public weather services lies in radar remote sensing: next-generation radar systems such as dual-polarisation radar and phased-array radar provide opportunities to improve severe weather detection, rainfall estimates and winter weather warnings, and to increase the lead time for severe weather hazards. With improving technology, meteorologists make use of climate models and computers to process information and forecast the weather. A climate model is a mathematical simulation of the processes that affect the atmosphere and produce local weather and the climate over large regions. The first climate model was developed in the late 1960s by the National Oceanic and Atmospheric Administration's Geophysical Fluid Dynamics Laboratory in Princeton, New Jersey. Its development was based on laboratory founder Joseph Smagorinsky's belief that only a completely new approach to scientific endeavour, departing from the independent, individual mode of inquiry, would produce answers to extremely complex problems. The first model combined oceanic and atmospheric processes. It allowed scientists to understand how the ocean and atmosphere interact to influence climate. The model also predicted how changes in the natural factors that control climate, such as ocean and atmospheric currents and temperatures, could lead to climate change. Soon, more climate models were developed, such as the simplest one-dimensional variational model using the Euler-Lagrange equation, which is based on the maximum entropy production hypothesis. This model calculates the latitudinal distribution of the emitted long-wave radiation and the meridional heat transport for a given latitudinal distribution of the absorbed solar radiation.
In the earth's usual climate system, the absorbed solar radiation dominates the emitted long-wave radiation at low latitudes, but the long-wave radiation dominates at high latitudes. The emitted long-wave radiative energy and the absorbed solar radiative energy are balanced globally, and integration of the net radiative flux at the top of the atmosphere from the South Pole to each latitude provides the required northward heat transport. These are basic and important characteristics of the earth's climate system, but they are not sufficient to determine the climate state. A simple model based on the Maximum Entropy Production principle can calculate the meridional distributions of the surface temperature and cloud amount, without treating detailed physical processes, for given distributions of the insolation and surface albedo. By treating the problem analytically, an Euler-Lagrange equation and a numerical method for solving it are obtained.

For the latitudinal distributions of the absorbed solar radiation and the emitted long-wave radiation, the notations I(θ) and O(θ) are used respectively, as functions of the latitude θ. The integration of the net radiative flux

F(θ) = ∫_{-π/2}^{θ} 2πa² cos θ′ [I(θ′) − O(θ′)] dθ′

provides a measure of the northward heat transport, and the integration

Ṡ = ∫_{-π/2}^{π/2} 2πa² cos θ [O(θ) − I(θ)] / T(θ) dθ

provides the entropy production rate associated with the above heat transport, where a is the earth's radius and T(θ) is an equivalent temperature related to O(θ) by O(θ) = σT(θ)⁴, with σ the Stefan-Boltzmann constant. We can assume that I(θ) is a given function and try to obtain an appropriate distribution of O(θ) for such an I(θ). A function J denoting the energy flux is defined as

J(µ) = 2πa² ∫_{-1}^{µ} [I(µ′) − O(µ′)] dµ′,

where µ is the sine of the latitude (µ = sin θ). The Euler-Lagrange equation for a variational problem with a functional K[y] = ∫ F(y′(x), y(x), x) dx is given by

d/dx (∂F/∂y′) − ∂F/∂y = 0.

Since the integrand of the entropy production depends on J only through its derivative J′(µ), the Euler-Lagrange equation of this problem reduces to

∂F/∂J′ = constant.

The general form of the adjoint equation is

−dλ/dt = L*λ + g,

where λ is the adjoint variable, L* is the adjoint operator of the linearization of f, and g is the gradient of the cost function.
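As a small numerical illustration of the transport integral above (not part of the original report; the absorbed-radiation profile I(θ) is hypothetical, and O is chosen uniform so that the global energy balance holds), the required northward heat transport can be computed by integrating the net radiative flux from the South Pole:

```python
import numpy as np

a = 6.371e6                                   # earth radius in metres
theta = np.linspace(-np.pi / 2, np.pi / 2, 1801)   # latitude in radians
I = 350.0 * np.cos(theta)                     # W m^-2, illustrative profile

# Area weight per unit latitude; O is the area-weighted mean of I, so that
# absorbed and emitted energy balance globally: integral of (I - O) dA = 0.
area_weight = 2 * np.pi * a**2 * np.cos(theta)
O = np.trapz(I * area_weight, theta) / np.trapz(area_weight, theta)

# F(theta): cumulative (trapezoid) integral of the net radiative flux at the
# top of the atmosphere from the South Pole to each latitude.
net = (I - O) * area_weight
F = np.concatenate(([0.0],
                    np.cumsum((net[1:] + net[:-1]) / 2 * np.diff(theta))))
# F returns to ~0 at the North Pole (global balance); it is negative in the
# southern mid-latitudes (southward transport) and positive in the northern.
```

The design choice here is simply to enforce the global balance by construction, so the cumulative integral closes at the North Pole.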
Another type of climate model developed was the chemical transport model, a three-dimensional model that uses observed or analysed wind, moisture, temperature and other meteorological conditions to calculate the transport of chemical substances through the atmosphere, and the reactions among them, as a function of time. A chemical transport model includes the processes by which chemical species are converted to aerosols and by which they are incorporated into rain and washed to the ground. The model is used to compute how the distribution of aerosols varies from place to place and over time. A recent development, by the United States Army Test and Evaluation Command and its collaborators, is a completely new meteorological support infrastructure called the Four Dimensional Weather (4DWX) system. The 4DWX modelling system is currently based on the fifth-generation Pennsylvania State University / National Centre for Atmospheric Research Mesoscale Model (MM5). The model has non-hydrostatic dynamics; a two-way interactive nesting procedure, with coarse grids that provide boundary conditions for fine grids running at smaller time steps and feedback from fine grids to coarse grids; and a radiative upper boundary condition that mitigates noise resulting from the reflection of vertically propagating waves. It also has time-dependent lateral boundary conditions, relaxed towards the large-scale model forecast. A nudging zone of five rows and columns is specified at the model lateral boundaries, with a nudging weight that allows the model variable tendencies to relax gradually to the larger-scale model forecasts along the boundary. The model uses the modified Medium-Range Forecast (MRF) model boundary layer parameterization. The MRF parameterization is a non-local mixing scheme; the Richardson number is used to determine the depth of the boundary layer. Cloud effects on radiation are allowed for shortwave radiation, and the Rapid Radiative Transfer Model is used for long-wave radiation.
The "Noah" land model with four soil layers is used. Soil moisture and soil temperature are predicted at each grid point based on substrate and atmospheric properties. The model has a land-surface data assimilation system that diagnoses current substrate moisture and temperature using in situ and remotely sensed data. The model has 36 computational levels, with approximately 12 levels within the lowest 1 km. The Newtonian relaxation method is used for data assimilation. Data assimilation by Newtonian relaxation is accomplished by adding non-physical nudging terms to the model predictive equations. These terms relax the model solution at each grid point towards observations, or towards analyses of observations, in proportion to the difference between the model solution and the data or analysis. This approach was used because it is relatively efficient computationally; it is robust; it allows the model to ingest data continuously rather than intermittently; the full model dynamics are part of the assimilation system, so that analyses contain all locally forced mesoscale features; and it does not unduly complicate the structure of the model code. The implementation of Newtonian relaxation in the 4DWX system forces the model solution towards observations rather than towards analyses of the data. This approach was chosen because observations on the mesoscale are sometimes sparse and typically are not very uniformly distributed in space, making objective analysis difficult. With station nudging, each observation is ingested into the model at its observed time and location, with proper space and time weights, and the model spreads the information in time and space according to the model dynamics.

Word Count: 1206

Chapter 2. Literature Review

2.1 Why Does 4D VAR Beat 3D VAR?

3D VAR (3 Dimensional Variation scheme) was introduced into operational global numerical weather prediction (NWP) in 1999.
Completion of the Perturbation Forecast (PF) model and its adjoint, and the availability of additional computing power, enabled operational global 4D VAR in 2004. A series of experiments was designed to find out which aspects of 4D VAR caused the improvement over 3D VAR. In particular, the experiments were designed to distinguish two possible causes: allowance for the actual time of each observation, and the use of time-evolved covariances to provide some flow-dependent structures [1]. Four variational assimilation schemes were used in the experiments: basic 3D VAR, 3D VAR with FGAT (First Guess at Appropriate Time), synoptic 4D VAR and basic 4D VAR. All use incremental variational minimisation. In the basic 3D VAR scheme, all observations in the time window T-3 to T+3 (where T is the analysis time) are treated as if they were at T+0, which is close to their average time. Incremental variational minimisation searches for the lower-resolution increment which, when added to the full model predictions, minimises a penalty measuring the deviations from the observations and from the background state at T+0. The analysed increment is initialised by running the PF model both backwards and forwards and combining the results using a digital filter, before being reconfigured to the resolution of the full model and added to the background at T+0 to create an analysis. This is used to start a 6-hour forecast and provide the fields at T+6 needed for the next cycle. The 3D VAR scheme with FGAT exploits the incremental formulation to provide the full-model estimate of the observations at the actual time of each observation. The full model state is saved at regular (hourly) intervals, and these states are then interpolated to the actual time and position of each observation. Only the deviations between the full model predictions and the observations are treated as if they were at T+0.
The VAR step and initialisation are identical to basic 3D VAR; the forecast produces fields from T+3 to T+9 for the next cycle. In the 4D VAR set-up, the variational analysis increment is at T-3. At each iteration of the minimisation the Perturbation Forecast model is integrated from T-3 to T+3 to give increments valid at the time of each observation; the adjoint model then calculates the gradients needed for minimising a penalty measuring the deviations from the observations and from the background state at T-3. After the same initialisation, the analysed increment is added to the full model background at T-3 to start a 12-hour forecast, giving the analysis at T+0 as well as the fields at the times needed for the next cycle. Basic 4D VAR differs from 3D VAR with FGAT in two ways: the observation increments are treated at the correct time, and the increment fields are evolved from T-3 using the Perturbation Forecast model. In order to distinguish these, a new scheme called 'synoptic 4D VAR' was created. This treats the observations exactly as 3D VAR with FGAT does; for example, it assumes all the increments are valid at T+0. The main difference is that the analysed increments are evolved to this time from T-3 using the Perturbation Forecast model, with the corresponding penalty gradients evolved back to T-3, giving an evolved background-error covariance. A minor difference is that the final analysis is evolved to T+0 using the full model. Basic 4D VAR has a similar evolution of the increments; it differs from synoptic 4D VAR only in evaluating increments at the correct time for each observation, rather than at T+0. The results showed that allowing for the time of observation in FGAT had a positive impact on the scores; using evolved covariances had a bigger impact, but it made bigger changes to the analyses, increasing the spread of the distribution through the chaotic growth of differences; and allowing for the time of observation in the increments had similar benefits to FGAT.
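The variational minimisation common to all four schemes can be illustrated with a deliberately tiny sketch (illustrative only; the operational schemes minimise the penalty over full model fields, not a two-variable state). For a state x, background xb with error covariance B, and an observation y = Hx + error with covariance R, the analysis minimises J(x) = (x − xb)ᵀB⁻¹(x − xb) + (y − Hx)ᵀR⁻¹(y − Hx), whose minimiser has a closed form:

```python
import numpy as np

# Toy two-variable "3D VAR": background, one observation of the first
# variable, and the closed-form minimiser of the quadratic penalty J(x).
xb = np.array([280.0, 5.0])          # background state (e.g. T, wind)
B = np.array([[4.0, 1.0],
              [1.0, 2.0]])           # background-error covariance
H = np.array([[1.0, 0.0]])           # observe the first variable only
R = np.array([[1.0]])                # observation-error covariance
y = np.array([283.0])                # observed value

# Minimising J gives x_a = xb + K (y - H xb), K = B H' (H B H' + R)^-1.
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
xa = xb + K @ (y - H @ xb)
```

Note how the background-error covariance B spreads the single-variable innovation to the unobserved second variable through the off-diagonal term, which is exactly the role the (evolved or static) covariance plays in the schemes compared above.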
Other than forecasting skill, which is obviously vital for NWP applications, there are other measures of analysis quality. A commonly used measure is the fit of the background to the observations, for example via verification of the 6-hour forecasts. In the reported comparison, one set of curves shows the global fit to radiosonde winds measured within the variational analysis at the first iteration, and another shows the fit of the analyses at the last iteration to the observations used; the latter is a harder statistic to interpret qualitatively. Near the surface, the 3D VAR analyses were able to fit the observations more closely than the 4D VAR schemes. This is possible because the very simple physical parameterisations of the PF model could not generate the correct frictional effects on the wind, whereas 3D VAR has more freedom to alter the winds directly to fit the observations. In the free troposphere, 3D VAR with FGAT shows a closer fit of both background and analyses, but the introduction of evolved covariances with synoptic 4D VAR cancels out this improvement. Finally, allowing for increments at the appropriate time in basic 4D VAR makes the closeness of fit similar again to 3D VAR with FGAT. The benefit seen from FGAT is an example of what can be gained from attention to detail in a high-quality data assimilation scheme, where predictions of observed values can have smaller errors than the observations themselves. Once error levels are low, it is important to address all sources of error, including mismatches in time, in order to reduce them further. The benefits from evolved covariances are offset by errors due to the Perturbation Forecast model and by modes not properly represented by it. Improvements to the Perturbation Forecast model and to the model error term should allow much longer time windows. It is also possible to obtain the benefit seen in the synoptic 4D VAR experiments by using cleverly constructed flow-dependent covariances in 3D VAR.
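The FGAT idea underlying these results — comparing each observation with the model state interpolated to its actual time, rather than with the T+0 state — can be sketched as follows (hypothetical single-variable states and observation, for illustration only):

```python
import numpy as np

# Model states saved at regular hourly intervals across the T-3..T+3 window.
save_times = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])   # hours
saved_state = np.array([10.0, 10.5, 11.2, 12.0, 12.6, 13.1, 13.5])

obs_time = 1.4        # actual time of an observation within the window
obs_value = 12.9

# FGAT: interpolate the background to the observation time before computing
# the innovation, instead of comparing everything at T+0.
bg_at_obs_time = np.interp(obs_time, save_times, saved_state)
innovation_fgat = obs_value - bg_at_obs_time
innovation_basic = obs_value - saved_state[save_times == 0.0][0]
# The FGAT innovation is smaller: part of the "error" seen by basic 3D VAR
# is just the evolution of the state between T+0 and the observation time.
```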
Basic 4D VAR allows for the time of each observation in its implicit evolved covariance, whereas synoptic 4D VAR used the covariance for the average time, T+0. The benefit seen from this was large, as the data selection was designed to capture observations near synoptic times where possible. A significantly bigger benefit is to be expected when including more data at non-synoptic times. For global models the most important observations are soundings from polar-orbiting satellites; a 12-hour time window is needed to get two soundings at most points. For regional models, where the typical observation frequency is higher, the benefit should be apparent with a shorter window.

2.2 Euler-Lagrange Equation

A simple one-dimensional climate model in the meridional direction is considered, based on the maximum entropy production hypothesis. This model calculates the latitudinal distribution of the emitted long-wave radiation and the meridional heat transport for a given latitudinal distribution of the absorbed solar radiation [2]. In the earth's usual climate system, the absorbed solar radiation dominates the emitted long-wave radiation at low latitudes, but the long-wave radiation dominates at high latitudes. The emitted long-wave radiative energy and the absorbed solar radiative energy are balanced globally, and integration of the net radiative flux at the top of the atmosphere from the South Pole to each latitude provides the required northward heat transport. These are basic and important characteristics of the earth's climate system, but they are not sufficient to determine the climate state. A simple model based on the Maximum Entropy Production (MEP) principle can calculate the meridional distributions of the surface temperature and cloud amount, without treating detailed physical processes, for given distributions of the insolation and surface albedo. By treating the problem analytically, an Euler-Lagrange equation and a numerical method for solving it are obtained.
For the latitudinal distributions of the absorbed solar radiation and the emitted long-wave radiation, the notations I(θ) and O(θ) are used respectively, as functions of the latitude θ. The integration of the net radiative flux

F(θ) = ∫_{-π/2}^{θ} 2πa² cos θ′ [I(θ′) − O(θ′)] dθ′

provides a measure of the northward heat transport, and the integration

Ṡ = ∫_{-π/2}^{π/2} 2πa² cos θ [O(θ) − I(θ)] / T(θ) dθ

provides the entropy production rate associated with the above heat transport, where a is the earth's radius and T(θ) is an equivalent temperature related to O(θ) by O(θ) = σT(θ)⁴, with σ the Stefan-Boltzmann constant. We can assume that I(θ) is a given function and try to obtain an appropriate distribution of O(θ) for such an I(θ). A function J denoting the energy flux is defined as

J(µ) = 2πa² ∫_{-1}^{µ} [I(µ′) − O(µ′)] dµ′,

where µ is the sine of the latitude (µ = sin θ). The Euler-Lagrange equation for a variational problem with a functional K[y] = ∫ F(y′(x), y(x), x) dx is given by

d/dx (∂F/∂y′) − ∂F/∂y = 0.

Since the integrand of the entropy production depends on J only through its derivative J′(µ), the Euler-Lagrange equation of this problem reduces to

∂F/∂J′ = constant.

The general form of the adjoint equation is

−dλ/dt = L*λ + g,

where λ is the adjoint variable, L* is the adjoint operator of the linearization of f, and g is the gradient of the cost function.

2.3 The Operational Mesogamma-Scale Analysis and Forecast System of the U.S. Army Test and Evaluation Command

The 4DWX system has been used at seven U.S. Army test ranges. Because most tests have weather-related environmental and safety constraints, forecasts are required for test scheduling and nowcasts are required for test conduct. Post-test analyses of the meteorological conditions affecting test results require a model-based data assimilation system to interpolate dynamically between observations when it is not possible to place sensors at the test location. The Army Test and Evaluation Command (ATEC) collaborated on the development and implementation of a completely new meteorological support infrastructure called the 4DWX system. The 4DWX system at each test range is tailored to meet the specific needs of that range.
The 4DWX modelling system is currently based on the fifth-generation Pennsylvania State University / National Centre for Atmospheric Research Mesoscale Model (MM5). The model has non-hydrostatic dynamics; a two-way interactive nesting procedure, with coarse grids that provide boundary conditions for fine grids running at smaller time steps and feedback from fine grids to coarse grids; and a radiative upper boundary condition that mitigates noise resulting from the reflection of vertically propagating waves. It also has time-dependent lateral boundary conditions, relaxed towards the large-scale model forecast. A nudging zone of five rows and columns is specified at the model lateral boundaries, with a nudging weight that allows the model variable tendencies to relax gradually to the larger-scale model forecasts along the boundary. The model uses the modified Medium-Range Forecast (MRF) model boundary layer parameterization. The MRF parameterization is a non-local mixing scheme; the Richardson number is used to determine the depth of the boundary layer. Cloud effects on radiation are allowed for shortwave radiation, and the Rapid Radiative Transfer Model is used for long-wave radiation. The "Noah" land model with four soil layers is used. Soil moisture and soil temperature are predicted at each grid point based on substrate and atmospheric properties. The model has a land-surface data assimilation system that diagnoses current substrate moisture and temperature using in situ and remotely sensed data. The model has 36 computational levels, with approximately 12 levels within the lowest 1 km. The Newtonian relaxation method is used for data assimilation. Data assimilation by Newtonian relaxation is accomplished by adding non-physical nudging terms to the model predictive equations [5]. These terms relax the model solution at each grid point towards observations, or towards analyses of observations, in proportion to the difference between the model solution and the data or analysis.
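Newtonian relaxation can be illustrated with a minimal sketch (a hypothetical single-variable system and nudging coefficient; the real 4DWX implementation applies space- and time-weighted nudging terms to the full MM5 predictive equations):

```python
def step(x, dt, x_obs, g_nudge, tau=5.0):
    """One forward-Euler step of dx/dt = f(x) + G*(x_obs - x).

    f(x) = -x/tau is a simple relaxation toward zero, standing in for the
    model physics; the second term is the non-physical nudging term that
    pulls the solution toward the observation in proportion to the
    difference between the model solution and the data.
    """
    f = -x / tau
    return x + dt * (f + g_nudge * (x_obs - x))

x = 10.0          # model solution at one grid point
x_obs = 2.0       # observed value ingested at this grid point
for _ in range(200):
    x = step(x, dt=0.1, x_obs=x_obs, g_nudge=0.5)
# The solution settles between the free-model attractor (0) and the
# observation, weighted by the nudging strength G relative to 1/tau.
```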
This approach was used because it is relatively efficient computationally; it is robust; it allows the model to ingest data continuously rather than intermittently; the full model dynamics are part of the assimilation system, so that analyses contain all locally forced mesoscale features; and it does not unduly complicate the structure of the model code. The implementation of Newtonian relaxation in the 4DWX system forces the model solution towards observations rather than towards analyses of the data. This approach was chosen because observations on the mesoscale are sometimes sparse and typically are not very uniformly distributed in space, making objective analysis difficult. With station nudging, each observation is ingested into the model at its observed time and location, with proper space and time weights, and the model spreads the information in time and space according to the model dynamics.

2.4 Dew Point Temperature Prediction

Dew point temperature is the temperature at which water vapour in the air will condense into dew or water droplets, given that the air pressure remains constant. Dew point temperature is critical to the survival of plants, especially in regions that have infrequent rainfall. In 2003, Hubbard developed a regression model for estimating the daily average dew point temperature, using the daily mean, minimum and maximum air temperatures as inputs. The research used 14 years of data for six cities. The most accurate regression equation had a mean absolute error of 2.2 °C. An ANN is a robust computational technique modelled after the biological neuron connections found in human brains. Like the human brain, an ANN is repeatedly exposed to inputs and varies the strength of the connections between neurons based on those inputs. Training an ANN is thus an iterative process, instead of the single calculation used with most types of regression and Bayesian classification.
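A Hubbard-style regression can be sketched as follows (the data and the fitted coefficients here are entirely synthetic and illustrative; the actual study fitted 14 years of observations for six cities):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily records: minimum, maximum and mean air temperature (°C).
n = 365
t_min = rng.uniform(5, 20, n)
t_max = t_min + rng.uniform(5, 15, n)
t_mean = (t_min + t_max) / 2
# Synthetic "observed" dew point, loosely tied to the temperatures.
t_dew = 0.9 * t_min - 0.1 * (t_max - t_min) + rng.normal(0, 1.0, n)

# Least-squares fit of Td ~ b0 + b1*Tmean + b2*Tmin + b3*Tmax, the single
# calculation that contrasts with the ANN's iterative training.
X = np.column_stack([np.ones(n), t_mean, t_min, t_max])
coef, *_ = np.linalg.lstsq(X, t_dew, rcond=None)
pred = X @ coef
mae = np.mean(np.abs(pred - t_dew))   # mean absolute error of the fit
```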
ANNs have been used to help solve many real-world problems, such as pattern matching, classification and prediction, and they have been used in the atmospheric sciences. In the early days, ANNs were used for the prediction of ozone concentrations, tornadoes, thunderstorms, solar radiation, carbon dioxide and monsoon rainfall. In 2000, an ANN was developed using 38 years of rainfall data to predict monthly and yearly precipitation levels for multiple sites in the Czech Republic, and in various areas of western Sydney, Australia, ANNs have been used for short-term prediction, focused on predicting flash-flood rainfall amounts 15 minutes ahead. Canadian researchers used an ensemble of ANNs to provide 24-hour predictions of average temperature, wind speed and humidity at Regina Airport in Canada. A separate ANN model incorporating the error back propagation (EBP) algorithm was developed. The EBP ANN consists of artificial neurons, called nodes, arranged into three layers: input, hidden and output. The input layer receives the data one case at a time; one or more hidden layers connect the input and output layers; and the output layer is interpreted as the prediction, classification or pattern. Each node at each layer is connected to some or all of the nodes in the next layer, and each connection has a weight which changes the value going through that connection. The nodes in the hidden layer and output layer can receive inputs from several nodes. These inputs are summed and then presented to an activation function. An error is calculated as the difference between the ANN output and the observed value associated with that input, and the partial derivatives of that error are used to adjust the weights by gradient descent. An EBP ANN model has two modes of operation.
The first is the feed-forward mode, in which a set of inputs x_i, where i ranges from 1 to I, is mapped to a single output z by the following equations:

z = g( Σ_{j=0}^{J} v_j y_j ),     y_j = f( Σ_{i=0}^{I} w_{ji} x_i ),

where w_{ji} are the weights from the input layer to the hidden layer, v_j are the weights from the hidden layer to the output node, and y_j is the output of node j in the hidden layer, where j ranges from 1 to J. The logistic activation function g is defined as

g(n) = 1 / (1 + e^{−n}),

where n is the input to the activation function. The hyperbolic tangent, Gaussian and Gaussian complement are the respective components of the hidden-layer activation function f:

tanh(n),     e^{−n²},     1 − e^{−n²},

where n is the input to the activation function. The second mode of the ANN is back-propagating the error to adjust the weights. The weight adjustment for each weight from the hidden layer to the output node is defined as

Δv_j = η (t − z) g′(n) y_j,

and the weight adjustment for each weight from the input layer to the hidden layer is defined as

Δw_{ji} = η (t − z) g′(n) v_j f′(n_j) x_i,

where η is the learning rate and t is the target output value. The nodes y_0 and x_0 are bias nodes that are always set to 1, although their corresponding weights are adjusted. The model yields the following results: the mean absolute errors for the 1-, 4-, 8- and 12-hour prediction models are 0.550 °C, 1.234 °C, 1.799 °C and 2.281 °C, respectively, with coefficients of determination (r²) of 0.993, 0.964, 0.924 and 0.889, respectively. The mean absolute error values increased and the r² values decreased as the lead time increased. There was also a tendency to over-predict at low dew point temperatures. For the final model evaluation, a comparison was made against predictions that simply used the current dew point temperature as the predicted value for the same observations. The improvement of the ANN model over the current dew point temperature was 0.035 °C for the 1-hour model, 0.162 °C for the 2-hour model, 0.212 °C for the 3-hour model, and between 0.3 °C and 0.4 °C for the 4- to 12-hour models.
The percentage improvement for the 2- to 10-hour models was relatively similar, ranging from 15.5% to 21.4%, but differed at the lower end, where the 1-hour model improved by only 6%, and at the higher end, where the 11-hour model improvement was 14.7% and the 12-hour model improvement was 11.9%. In this project, a similar approach is taken to create the climate model using the ANN presented.

2.5 Overview of Artificial Neural Networks

Nature has developed a very complex neuronal morphology in biological species. Biological neurons, over one hundred billion in number, in the central nervous systems of humans play a very important role in the various complex sensory, control, affective and cognitive aspects of information processing and decision making. In neuronal information processing, a variety of complex mathematical operations and mapping functions act in synergy in a parallel cascade structure, forming a complex pattern of neuronal layers that evolves into a sort of pyramidal pattern. The information flows from one neuronal layer to another in the forward direction with continuous feedback, and it evolves into a dynamic pyramidal structure. The structure is pyramidal in the sense of the extraction and convergence of information at each point in the forward direction. A study of biological neuronal morphology provides not only a clue but also a challenge in the design of a realistic cognitive computing machine. The ANN is modelled on the biological neural network. Like the biological neural network, the ANN is an interconnection of nodes, analogous to neurons [5]. Each neural network has three critical components: node character, network topology and learning rules. Node character determines how signals are processed by the node, such as the number of inputs and outputs associated with the node, the weight associated with each input and output, and the activation function. Network topology determines the ways nodes are organised and connected.
Learning rules determine how the weights are initialised and adjusted.

Node Character

The basic model for a node in the ANN is shown below. Each node receives multiple inputs from others via connections that have associated weights, analogous to the strength of a synapse. When the weighted sum of inputs exceeds the threshold value of the node, the node activates, passes the signal through a transfer function and sends it to neighbouring nodes. This process can be expressed as a mathematical model:

y = f( Σ_i w_i x_i − T ),

where y is the output of the node, f is the transfer function, w_i is the weight of input x_i and T is the threshold value. The transfer function can take many forms. A nonlinear transfer function is more useful than a linear one, since only a few problems are linearly separable. The simplest is the step function:

f(n) = 1 if n ≥ 0, and f(n) = 0 otherwise.

The sigmoid function is also often used as the activation function, since the function and its derivative are continuous.

[Figure: a node with inputs x1 … xn, weights w1 … wn, transfer function f and output y]

Network Topology

In an ANN, the nodes are organised into linear arrays called layers. Usually there are input layers, output layers and hidden layers; there can be anywhere from none to several hidden layers. Designing the network topology involves determining the number of nodes at each layer, the number of layers in the network, and the paths of the connections among the nodes. Usually those factors are initially set by intuition and optimised through multiple cycles of experiments. Some rational methods can also be used to design a neural network; for example, the genetic neural network uses a genetic algorithm to select the input features for a neural network solving a quantitative structure-activity relationship problem. There are two types of connection between nodes. One is a one-way connection with no loop back. The other is a loop-back connection, in which the output of a node can be the input to previous or same-level nodes.
Based on the aforementioned types of connections, neural networks can be classified into two types: feed forward networks and feedback networks. Because the signal travels one way only, the feed forward network is static; that is, one input is associated with one particular output. The feedback network is dynamic. For one input, the state of the feedback network changes for many cycles until it reaches an equilibrium point, so one input produces a series of outputs. The Perceptron is a widely used feed forward network.

Learning rules

The ANN uses a learning process to train the network. During the training, weights are adjusted to desired values. The learning can be classified into two major categories: supervised learning and unsupervised learning. In supervised learning, a training set, that is, examples of inputs and corresponding target outputs, is provided. The weights are adjusted to minimise the error between the network output and the correct output. Special consideration is needed to construct the training set. The ideal training set must be representative of the underlying model. An unrepresentative training set cannot produce a very reliable and general model. For networks using supervised learning, the network must be trained first. When the network produces the desired outputs for a series of inputs, the weights are fixed and the network can be put into operation. In contrast, unsupervised learning does not use target output values from a training set. The network tries to discover the underlying pattern or trend from the input data alone. Different types of networks require different learning processes. Many different learning schemes have been invented for ANNs to achieve different learning goals. The most frequently used learning approaches are error correction methods and nearest neighbour methods. Error correction methods normally have a back propagation mechanism. Let yk,n be the output of the kth output node at step n and yk be the target output for the kth node.
An error function can be defined as the difference between the target output and the node output:

ek,n = yk − yk,n

The Back Propagation Algorithm is an iterative gradient algorithm designed to minimise the mean square error between the actual output and the desired output. This algorithm is also known as "the generalised delta rule". The neurons in layers other than the input and output layers are called the hidden units or hidden nodes, as their outputs do not directly interact with the environment. With the Back Propagation Algorithm, the weights associated with the hidden layers can also be adjusted, thus enabling the ANNs to learn.

Word Count: 3938

Chapter 3. Objectives

3.1 Objectives of Project

The objective of this project is to design a system to recommend environmental weather conditions for a new town with reference to a particular climate model. In this project, an ANN is used to predict the environmental weather condition for the new town. An in-depth description of the climate model will be given in Chapter 5 of this report.

3.2 Main Technique Used

MATLAB is used as the platform to perform the mathematical computation for the Neural Network. The computation in the MATLAB program is based on the theory of neuronal approximation using a feed forward neural network with the back propagation algorithm. The theory of function approximation is an important class of problems in both static and dynamic processes. The theory of neuronal approximations captured the attention of neural scientists at the IEEE First International Conference on Neural Networks in 1987, held in San Diego, when R. Hecht-Nielsen reiterated Kolmogorov's theorem, which states that one can express a continuous multivariable function, on a compact domain, in terms of sums and compositions of single variable functions [5]. The number of single variable functions required is finite. It implies that there are no nemesis functions that cannot be approximated by neural networks.
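As a small illustration of the generalised delta rule at work, the following Python sketch (an illustrative aside, not the project's MATLAB code; all names are hypothetical) trains a tiny feed forward network with one hidden layer by back propagation. A constant 1 appended to each input plays the role of a bias node:

```python
import math
import random

def sigmoid(n):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-n))

def forward(x, W, V):
    """Feed forward pass: hidden outputs h, then a single output z."""
    h = [sigmoid(sum(W[j][i] * x[i] for i in range(len(x)))) for j in range(len(W))]
    z = sigmoid(sum(V[j] * h[j] for j in range(len(h))))
    return h, z

def train(samples, hidden=3, lr=0.5, epochs=4000, seed=1):
    """Back propagation: adjust weights by gradient descent on the squared error."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    V = [rng.uniform(-1, 1) for _ in range(hidden)]
    for _ in range(epochs):
        for x, t in samples:
            h, z = forward(x, W, V)
            d_out = (t - z) * z * (1 - z)                  # output-node error term
            for j in range(hidden):
                d_hid = d_out * V[j] * h[j] * (1 - h[j])   # error propagated to hidden node j
                V[j] += lr * d_out * h[j]
                for i in range(n_in):
                    W[j][i] += lr * d_hid * x[i]
    return W, V

# Learn the logical AND function; the trailing 1 in each input is the bias node.
samples = [((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 1), 0), ((1, 1, 1), 1)]
W, V = train(samples)
mse = sum((t - forward(x, W, V)[1]) ** 2 for x, t in samples) / len(samples)
# mse is small once training has converged
```

The same loop, with more hidden nodes and real-valued temperature inputs, is essentially what the MATLAB Neural Network Toolbox automates in Chapter 5.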
The parallel and layered morphology of the neural systems is responsible for solving a wider class of problems in fields such as system approximation, control, learning and adaptation. The functional approximation capability of the feed forward neural network architecture is one of the properties of neural structures and has potential for applications to problems such as system identification and pattern recognition. A feed forward network structure may be treated as a rule for computing the output values of the neurons in the ith layer using the output values of the (i−1)th layer, hence implementing a class of mappings from the input space Rn to the output space Rm. The Back Propagation Algorithm is an iterative gradient algorithm designed to minimise the mean square error between the actual output and the desired output. This algorithm is also known as "the generalised delta rule". The neurons in layers other than the input and output layers are called the hidden units or hidden nodes, as their outputs do not directly interact with the environment. With the Back Propagation Algorithm, the weights associated with the hidden layers can also be adjusted, thus enabling the ANN to learn.

Word Count: 403

Chapter 4.
Project Management

4.1 Gantt Chart

Activities | Start Date | Completion Date
Literature Review | 11 Jan | 30 Aug
Collation of information | 9 Feb | 1 Mar
Writing Initial Report | 9 Feb | 1 Mar
Review on Climate Model | 2 Mar | 5 Apr
Determine the Type of Climate Model used | 6 Apr | 26 Apr
Writing of Interim Report | 13 Apr | 26 Apr
Creating climate model | 4 May | 2 Aug
Evaluating Climate Model | 3 Aug | 23 Aug
Troubleshooting of Climate Model | 17 Aug | 30 Aug
Writing of skeleton for final report | 31 Aug | 13 Sep
Writing of final report | 14 Sep | 27 Sep
Formatting and finalising content for report | 28 Sep | 11 Oct
Send supervisor corrected version for binding | 12 Oct | 25 Oct
Design of Project Poster | 26 Oct | 8 Nov
Presentation preparation | 16 Nov | 28 Nov

Table 1: Project Plan Schedule

Literature review is to gather information on meteorology. As I am not familiar with the field of meteorology, more time is required for me to read up on journals, books and internet articles on this particular subject. This will be done at the NUS, NTU and National Libraries. As most of these journals and books could not be loaned out of the libraries, reading could only be done on their premises.

Collation of information is to collate all the information found and prepare it for the drafting of the initial report.

Writing initial report. After writing the initial report, all the information gathered is pieced together. This gives me a better idea and understanding of the project. It also helps me in planning the time schedule for this project.

Review on climate model. This is to find out the advantages and disadvantages of some of the climate models being used, so as to give me a better perspective in creating the climate model.

Creating climate model. The program I will use to create the climate model for simulation is MATLAB, as I am more familiar with using the program.
With the help of the workshop on MATLAB organised by UniSIM, I hope to have a better knowledge of using the program to create the model.

Evaluating climate model. After the simulation program has been created, I will evaluate the accuracy of the program with real data input using charts and graphs. After evaluating the program, I will review the program where necessary.

Troubleshooting of climate model. If the program is unable to produce the right result, the necessary adjustments will be made to the program.

Writing of skeleton for final report. A content page will be drafted out with the headings and sub-headings to determine the content of the final report.

Writing of final report. This report will illustrate all the research and findings from this project, thus more time is planned for writing the final report.

Formatting and finalising content of report. The report will be sent to the project supervisor for critique, and the necessary adjustments will be made before sending the final version for binding.

Design of project poster. This is to extract the main essence from the final report and present the results and findings on an A1 size poster.

Presentation preparation. This includes designing the layout of the presentation poster and organising my presentation.

Word Count: 401

Chapter 5. Design of Project

5.1 Introduction of Climate Model

A climate model is a mathematical simulation of the processes that affect the atmosphere and produces weather predictions for a particular small area, like a town, or over large regions of land. It is used as a tool to help us understand these processes and to forecast how changes may affect weather and climate. A climate model usually consists of a computer program in the form of a series of equations, such as the ANN, the Chemical Transport Model and the Euler-Lagrange equation.
These models allow the physical laws controlling the weather to be applied, but this is possible only to an extent because not all climate processes are fully understood. Generally, a climate model can be classified based on the dimensional grid with which most models begin their construction. The grid may be two dimensional or three dimensional. Two dimensional models may represent two horizontal dimensions, like the lines of latitude and longitude on a map. Initial data on factors such as air pressure, temperature, humidity, cloud amount and wind speed are applied to each intersection between the grid lines, where all calculations of the physical effects are made. The sensitivity of the model depends on the scale of the grid. The smaller the distance between the grid lines, the more accurate the model will be, but this will also result in an increased number of calculations, and therefore the computing power required by the program will also increase. More advanced supercomputers are used to construct and run the more complex models, and the reliability of the models is directly proportional to the amount of computational power; however, all models make assumptions about factors that are not well understood. All models are subjected to various kinds of tests before they can be used to estimate the consequences of change. These tests are conducted by using data from the recent past and then running the program to observe how well it simulates the weather conditions that were actually recorded. If the test is successful, the model will then be used to simulate conditions from the more distant past as an aid to understanding how those conditions developed. After all these tests, the model will then be used to predict future weather conditions.

5.2 Description on the type of Climate Model adopted

The type of Climate Model adopted will be a Neural Network, or ANN, which is a robust computational technique modelled after the biological neuron connections found in the human brain.
It is one of the areas of current research and is attracting people from a wide variety of disciplines of science and technology. It can be used to predict average air temperature, wind speed and humidity. In an ANN, the fundamental unit used is an approximated mathematical model of a neuron. The connection strengths between layers are called weights. The process of adjusting the weights is called learning or training. The learning procedure constructs new representations, and the results of learning can be viewed as numerical solutions to the problem of whether to use local or distributed representations. The learning can be classified into two major categories: supervised learning and unsupervised learning. In supervised learning, a training set, that is, examples of inputs and corresponding target outputs, is provided. The weights are adjusted to minimise the error between the network output and the correct output. An EBP algorithm can be incorporated with the ANN. It consists of artificial neurons, called nodes, which are arranged in three layers: the input, hidden and output layers. The input layer is where the data is entered into the network, while the output layer gives the results, or the prediction, of the data. Each node at each layer is connected to some or all the nodes in the next layer, and each connection has a weight which changes the value going through that connection. The nodes in the hidden layer and output layer can receive inputs from several nodes. These inputs are added and presented to an activation function. An error is computed as the difference between the network output and the observed data for the input. The partial derivatives of that error are used to adjust the weights using gradient descent. Increasing the number of nodes will reduce the error, but the time taken for the computation will also increase. An EBP ANN model has two modes.
The first is a feed forward mode where a set of inputs xi, where i ranges from 1 to I, is mapped to a single output z by the following equations:

z = g( Σj vj yj )    yj = f( Σi wij xi )

where wij are the weights from the input layer to the hidden layer, vj are the weights from the hidden layer to the output node, and yj is the output of the nodes in the hidden layer, where j ranges from 1 to J. The logistic activation function g is defined as follows:

g(n) = 1 / (1 + e^(−n))

where n is the input to the activation function. The hyperbolic tangent, Gaussian, and Gaussian complement are the respective components of the hidden layer activation function f, defined as follows:

f1(n) = tanh(n)    f2(n) = e^(−n²)    f3(n) = 1 − e^(−n²)

where n is the input to the activation function. The second mode of the ANN is back propagating the error to adjust the weights. The weight adjustment for each weight from the hidden layer to the output node is defined as

Δvj = η (t − z) yj

The weight adjustment for each weight from the input layer to the hidden layer is defined as

Δwij = η (t − z) vj f′( Σi wij xi ) xi

where η is the learning rate and t is the target output value. The nodes y0 and x0 are bias nodes that are always set to 1, although their corresponding weights are adjusted [7].

5.3 Type of Software used

The type of software that will be used in this project is MATLAB. It will be used to compute and predict the average temperature using the Neural Network Toolbox. Compared with other neural network software such as DTREG and NeuralTools, the advantage of MATLAB is that it is more user friendly; the software is also widely used in other modules of my course of studies, therefore I am able to use the program more efficiently and effectively.

5.4 Description of the Program

MATLAB stands for matrix laboratory. It is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.
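The two modes of the EBP model described in Section 5.2 can be sketched in Python (an illustrative aside, not the project's MATLAB code; for simplicity the sketch uses the logistic function for the hidden layer as well, and all names are hypothetical). A single back propagation step moves the output toward the target:

```python
import math

def g(n):
    """Logistic activation: g(n) = 1 / (1 + e^-n)."""
    return 1.0 / (1.0 + math.exp(-n))

def forward(x, w, v):
    """Feed forward mode: x includes the bias node x0 = 1; y includes bias y0 = 1."""
    y = [1.0] + [g(sum(w[j][i] * x[i] for i in range(len(x)))) for j in range(len(w))]
    z = g(sum(v[j] * y[j] for j in range(len(y))))
    return y, z

def update(x, t, w, v, eta=0.5):
    """Back propagation mode: one gradient-descent step on the squared error."""
    y, z = forward(x, w, v)
    d_out = (t - z) * z * (1 - z)                        # output-node error term
    for j in range(len(w)):
        d_hid = d_out * v[j + 1] * y[j + 1] * (1 - y[j + 1])
        for i in range(len(x)):
            w[j][i] += eta * d_hid * x[i]                # input-to-hidden adjustment
    for j in range(len(v)):
        v[j] += eta * d_out * y[j]                       # hidden-to-output adjustment

x, t = [1.0, 0.7, -0.3], 0.9                             # x[0] is the bias node
w = [[0.1, 0.2, -0.1], [0.0, -0.2, 0.3]]                 # input -> hidden weights
v = [0.1, 0.3, -0.2]                                     # hidden (incl. bias) -> output
_, z0 = forward(x, w, v)
update(x, t, w, v)
_, z1 = forward(x, w, v)
# after one step the output z moves closer to the target t
```

Note that the weight attached to the bias node is adjusted along with the others, as stated above.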
Typical uses include mathematical computation, algorithm development, data acquisition, modelling, simulation, data analysis, and scientific and engineering graphics. MATLAB is an interactive system whose basic data element is an array that does not require dimensioning [9]. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or FORTRAN.

Word Count: 1169

Chapter 6. Experimentation of Program

6.1 Type of test conducted

In this project, the hourly average air temperature is taken between 18 September 2009 and 7 October 2009 for the town of Wilma, Florida [8]. The first 10 days of data, which consist of 240 samples, will be used as the input data fed to the Neural Network, and the next 10 days of data, from 28 September 2009 to 7 October 2009, will be used to validate the performance of the Neural Network. This set of data is included in Appendix 1 of this report. A comparison will be made between the predicted and the observed hourly average air temperature to determine the accuracy of the Neural Network. Mean Square Error will be used to quantify the difference between the predicted air temperature and the observed air temperature. Mean Square Error measures the average of the squares of the errors. The error is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness, or because the estimator does not account for information that could produce a more accurate estimate. During the training phase, Levenberg-Marquardt optimisation is used as the training method, as it is able to train a neural network 10 to 100 times faster than the usual gradient descent back propagation method. It always computes the approximate Hessian matrix, which has dimensions n-by-n.
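The Mean Square Error used to judge the models in Chapter 7 is simply the average of the squared differences between predicted and observed values; a minimal Python sketch (illustrative only, with made-up temperature values):

```python
def mean_square_error(predicted, observed):
    """Average of the squared differences between prediction and observation."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)

# Hypothetical predicted vs. observed hourly temperatures (degrees C).
predicted = [23.1, 24.0, 26.2]
observed = [23.33, 24.44, 26.11]
mse = mean_square_error(predicted, observed)
# a perfect prediction gives an MSE of exactly 0
```

Because the errors are squared, a few large misses raise the MSE far more than many small ones, which is why it is a stricter yardstick than the average absolute error.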
The parameters for Model 1 will be set as follows: 60% of the samples will be randomly selected for training; this set of data is presented to the network during the training phase, and the network is adjusted according to its error. 20% will be used for testing, which measures network generalisation and halts training when generalisation stops improving. The remaining 20% will be used for validation; this set of data has no effect on training and so provides an independent measure of network performance during and after training. The number of hidden neurons will be set at 30. The network architecture is shown in Figure 1 below.

Figure 1. Network architecture of Neural Network with 30 hidden layers

The parameters for Model 2 will be set similarly to Model 1, but the number of hidden neurons will be set at 60. This is to determine whether the increase in the number of neurons will affect the performance of the Neural Network. The network architecture is shown in Figure 2 below.

Figure 2. Network architecture of Neural Network with 60 hidden layers

Word Count: 426

Chapter 7. Results

7.1 Model 1

After running the Neural Network program, the hourly average air temperature from 28 September to 7 October predicted by Model 1, as explained in the previous chapter, is illustrated in Figure 3, while Figure 4 shows the observed hourly air temperature for the same period. This set of results yields a Mean Square Error of 0.0825.

Figure 3. Predicted Average Air Temperature from 28 September to 7 October

Figure 4. Observed Average Air Temperature from 28 September to 7 October

From Figure 5, we can clearly see that the predicted hourly air temperature is close to the observed hourly air temperature. The error margin between the predicted and the observed temperature is illustrated in Figure 6. The error in temperature varies between −1.2 °C and 1.6 °C.

Figure 5.
Predicted / Observed Average Air Temperature from 28 September to 7 October for Model 1

Figure 6. Error margin between Predicted and Observed average air temperature for Model 1

With reference to the Mean Square Error of the predicted temperature, we are able to conclude that this Model is fairly accurate, although there are a few occasions where the error margin is more than 0.5 °C.

7.2 Model 2

In Model 2 of the Neural Network, the number of hidden neurons is increased to 60. The predicted and observed hourly air temperature is illustrated in Figure 7. This set of results yields a Mean Square Error of 0.0816. The error in temperature varies between −1.08 °C and 1.96 °C.

Figure 7. Predicted / Observed Average Air Temperature from 28 September to 7 October for Model 2

Figure 8. Error margin between Predicted and Observed average air temperature for Model 2

For Model 2, the number of hidden neurons increases from 30 to 60, and this reduces the Mean Square Error only marginally, from 0.0825 to 0.0816, which implies that a further increase in the number of hidden neurons is unnecessary, as it has a negligible effect on the performance of the Neural Network.

Word Count: 266

Chapter 8. Conclusion

8.1 Achievements

During the 10 months of completing this capstone project, I have gained better knowledge in the field of meteorology through the various journals, books and articles that I have read and reviewed, especially in weather forecasting, the main research topic that I am focusing on in this project. I have also learnt how to use the Neural Network Toolbox in the MATLAB software to predict the weather with a set of given input data. Other than the knowledge gained in these various aspects, a lot of soft skills, such as project and time management using the Gantt chart, were also learnt. As this project spreads over 10 months, time management and discipline in following the planned timeline are important so as not to complete the project at the last hour.
8.2 Challenges Faced

The main challenge faced while doing this project was the initial process of gathering information. This is my first time dealing with the subject of meteorology, therefore I am unfamiliar with it. There are lots of resources available in the libraries to widen my knowledge of meteorology, but with limited time, I have to be selective about the areas that I need to review. Time commitment in completing the project on time is another challenge. As I am working full time, work commitments take a higher priority, but nevertheless, some sacrifices were made to allow me to complete this project within the stipulated time.

Word Count: 250

Chapter 9. Recommendation for further improvement

In this project, only one parameter, the average air temperature, is used for computation. A further improvement would be to include multiple parameters, such as Relative Humidity, Solar Radiation and Wind Speed, as inputs to predict whether a particular day will be rainy, sunny or windy. Another possible improvement that can be considered is to enlarge the region to be forecasted, as currently the region considered is the size of a town.

Word Count: 84

Chapter 10. Critical review and reflection

After completing this capstone project, which marks the final stage of my studies for the degree programme, I am glad that I have learnt a lot of new knowledge as well as soft skills. During the 10 months of completing this capstone project, there were times that obstacles were faced, especially at the beginning of this 10-month journey, but regular meetings with my project supervisor, Mr Nava Selvaratnam, who provided constructive advice and words of encouragement, spurred me on to complete this capstone project on time.

Word Count: 102

Grand Total: 8484

REFERENCES

[1] Andrew C. Lorenc and F. Rawlins, "Why Does 4D-Var Beat 3D-Var?", Quarterly Journal of the Royal Meteorological Society, Vol. 131, Oct 2005, Part C, p. 3247.
[2] Shigenori Murakami and Akio Kitoh, "Euler-Lagrange Equation of the Simple 1-D Climate Model Based on the Maximum Entropy Production Hypothesis", Quarterly Journal of the Royal Meteorological Society, Vol. 131, Apr 2005, Part B, p. 1529.
[3] Yubao Liu, Thomas T. Warner, James F. Bowers, Laurie P. Carson, "The Operational Mesogamma-Scale Analysis and Forecast System of the US Army Test and Evaluation Command", Journal of Applied Meteorology and Climatology, Vol. 47, April 2008, p. 1077.
[4] D.B. Shank, G. Hoogenboom, R.W. McClendon, "Dew Point Temperature Prediction Using Artificial Neural Networks", Journal of Applied Meteorology and Climatology, Vol. 47, June 2008, p. 1757.
[5] David Livingstone, Artificial Neural Networks: Methods and Applications, 2008.
[6] Simon Haykin, Neural Networks: A Comprehensive Foundation, 1999.
[7] Madan M. Gupta, Liang Jin, Noriyasu Homma, Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory, 2003.
[8] http://www.wrcc.dri.edu/cgi-bin/wea_list.pl?laFWIL (last accessed on 12 October 2009)
[9] Howard Demuth, Mark Beale, Martin Hagan, Neural Network Toolbox™ 6 User's Guide.
[10] Steven K. Rogers, Matthew Kabrisky, An Introduction to Biological & Artificial Neural Networks for Pattern Recognition, 1991.
[11] Phil Picton, Neural Networks, Second Edition, 2000.
[12] Jacek M. Zurada, Introduction to Artificial Neural Systems, 1992.

Appendix 1.
Hourly Average Air Temperature for Wilma, Florida from 18 September to 7 October Date Time Deg C Date Time Deg C DD/MM/YYYY hh:mm Average Air Temp DD/MM/YYYY hh:mm Average Air Temp 18/9/2009 0:00 23.33 19/9/2009 13:00 31.11 18/9/2009 1:00 23.33 19/9/2009 14:00 24.44 18/9/2009 2:00 23.33 19/9/2009 15:00 24.44 18/9/2009 3:00 23.33 19/9/2009 16:00 23.89 18/9/2009 4:00 23.33 19/9/2009 17:00 23.89 18/9/2009 5:00 23.33 19/9/2009 18:00 23.89 18/9/2009 6:00 23.33 19/9/2009 19:00 23.89 18/9/2009 7:00 22.22 19/9/2009 20:00 23.33 18/9/2009 8:00 23.33 19/9/2009 21:00 23.33 18/9/2009 9:00 24.44 19/9/2009 22:00 22.78 18/9/2009 10:00 26.11 19/9/2009 23:00 22.78 18/9/2009 11:00 28.33 20/9/2009 0:00 23.33 18/9/2009 12:00 29.44 20/9/2009 1:00 23.33 18/9/2009 13:00 30.56 20/9/2009 2:00 22.78 18/9/2009 14:00 29.44 20/9/2009 3:00 22.78 18/9/2009 15:00 30.56 20/9/2009 4:00 22.78 18/9/2009 16:00 30.56 20/9/2009 5:00 22.78 18/9/2009 17:00 30.56 20/9/2009 6:00 22.78 18/9/2009 18:00 27.78 20/9/2009 7:00 23.33 18/9/2009 19:00 25.56 20/9/2009 8:00 23.89 18/9/2009 20:00 24.44 20/9/2009 9:00 25.56 18/9/2009 21:00 23.89 20/9/2009 10:00 28.33 18/9/2009 22:00 23.33 20/9/2009 11:00 30.56 18/9/2009 23:00 23.33 20/9/2009 12:00 31.11 19/9/2009 0:00 22.22 20/9/2009 13:00 32.22 19/9/2009 1:00 22.78 20/9/2009 14:00 31.67 19/9/2009 2:00 22.22 20/9/2009 15:00 32.78 19/9/2009 3:00 22.78 20/9/2009 16:00 31.67 19/9/2009 4:00 23.33 20/9/2009 17:00 31.11 19/9/2009 5:00 22.78 20/9/2009 18:00 30 19/9/2009 6:00 22.78 20/9/2009 19:00 25.56 19/9/2009 7:00 22.78 20/9/2009 20:00 24.44 19/9/2009 8:00 25 20/9/2009 21:00 23.33 19/9/2009 9:00 27.78 20/9/2009 22:00 22.78 19/9/2009 10:00 30 20/9/2009 23:00 22.78 19/9/2009 11:00 31.11 21/9/2009 0:00 23.33 19/9/2009 12:00 28.89 21/9/2009 1:00 22.78 A-1 Date Time Deg C Date Time Deg C DD/MM/YYYY hh:mm Average Air Temp DD/MM/YYYY hh:mm Average Air Temp 21/9/2009 1:00 22.78 22/9/2009 17:00 30 21/9/2009 2:00 23.33 22/9/2009 18:00 28.33 21/9/2009 3:00 23.33 22/9/2009 19:00 26.11 
21/9/2009 4:00 23.33 22/9/2009 20:00 24.44 21/9/2009 5:00 23.33 22/9/2009 21:00 23.89 21/9/2009 6:00 23.33 22/9/2009 22:00 23.33 21/9/2009 7:00 23.89 22/9/2009 23:00 22.78 21/9/2009 8:00 25 23/9/2009 0:00 22.22 21/9/2009 9:00 26.67 23/9/2009 1:00 22.22 21/9/2009 10:00 26.67 23/9/2009 2:00 21.67 21/9/2009 11:00 26.67 23/9/2009 3:00 21.67 21/9/2009 12:00 29.44 23/9/2009 4:00 21.67 21/9/2009 13:00 31.67 23/9/2009 5:00 21.11 21/9/2009 14:00 30.56 23/9/2009 6:00 21.11 21/9/2009 15:00 28.89 23/9/2009 7:00 21.11 21/9/2009 16:00 31.67 23/9/2009 8:00 23.33 21/9/2009 17:00 30.56 23/9/2009 9:00 27.22 21/9/2009 18:00 30.56 23/9/2009 10:00 29.44 21/9/2009 19:00 26.67 23/9/2009 11:00 31.11 21/9/2009 20:00 25.56 23/9/2009 12:00 32.22 21/9/2009 21:00 25 23/9/2009 13:00 32.22 21/9/2009 22:00 24.44 23/9/2009 14:00 32.78 21/9/2009 23:00 23.89 23/9/2009 15:00 33.89 22/9/2009 0:00 23.89 23/9/2009 16:00 32.78 22/9/2009 1:00 23.89 23/9/2009 17:00 32.22 22/9/2009 2:00 23.33 23/9/2009 18:00 30.56 22/9/2009 3:00 23.33 23/9/2009 19:00 26.67 22/9/2009 4:00 23.33 23/9/2009 20:00 27.22 22/9/2009 5:00 23.89 23/9/2009 21:00 27.22 22/9/2009 6:00 23.89 23/9/2009 22:00 25.56 22/9/2009 7:00 23.89 23/9/2009 23:00 25 22/9/2009 8:00 25.56 24/9/2009 0:00 24.44 22/9/2009 9:00 27.78 24/9/2009 1:00 23.33 22/9/2009 10:00 29.44 24/9/2009 2:00 22.78 22/9/2009 11:00 30.56 24/9/2009 3:00 22.78 22/9/2009 12:00 31.67 24/9/2009 4:00 22.78 22/9/2009 13:00 31.67 24/9/2009 5:00 22.22 22/9/2009 14:00 32.22 24/9/2009 6:00 22.78 22/9/2009 15:00 31.67 24/9/2009 7:00 22.78 22/9/2009 16:00 30.56 24/9/2009 8:00 23.89 A-2 Date Time Deg C Date Time Deg C DD/MM/YYYY hh:mm Average Air Temp DD/MM/YYYY hh:mm Average Air Temp 24/9/2009 9:00 26.67 26/9/2009 1:00 22.22 24/9/2009 10:00 29.44 26/9/2009 2:00 22.22 24/9/2009 11:00 30 26/9/2009 3:00 22.22 24/9/2009 12:00 31.11 26/9/2009 4:00 22.22 24/9/2009 13:00 32.78 26/9/2009 5:00 21.67 24/9/2009 14:00 32.78 26/9/2009 6:00 21.11 24/9/2009 15:00 32.78 26/9/2009 7:00 21.11 24/9/2009 
Date (DD/MM/YYYY)  Time (hh:mm)  Average Air Temp (Deg C)
24/9/2009  16:00  32.78
24/9/2009  17:00  32.22
24/9/2009  18:00  27.22
24/9/2009  19:00  25
24/9/2009  20:00  23.89
24/9/2009  21:00  23.33
24/9/2009  22:00  23.89
24/9/2009  23:00  23.33
25/9/2009  0:00  22.78
25/9/2009  1:00  22.22
25/9/2009  2:00  22.22
25/9/2009  3:00  22.22
25/9/2009  4:00  22.22
25/9/2009  5:00  22.22
25/9/2009  6:00  22.22
25/9/2009  7:00  22.78
25/9/2009  8:00  25
25/9/2009  9:00  28.89
25/9/2009  10:00  31.11
25/9/2009  11:00  31.67
25/9/2009  12:00  32.22
25/9/2009  13:00  33.89
25/9/2009  14:00  32.78
25/9/2009  15:00  31.67
25/9/2009  16:00  29.44
25/9/2009  17:00  28.89
25/9/2009  18:00  27.78
25/9/2009  19:00  26.11
25/9/2009  20:00  25
25/9/2009  21:00  23.89
25/9/2009  22:00  23.33
25/9/2009  23:00  23.33
26/9/2009  0:00  22.78
26/9/2009  8:00  24.44
26/9/2009  9:00  28.89
26/9/2009  10:00  31.67
26/9/2009  11:00  32.22
26/9/2009  12:00  32.78
26/9/2009  13:00  34.44
26/9/2009  14:00  32.78
26/9/2009  15:00  32.78
26/9/2009  16:00  28.33
26/9/2009  17:00  30.56
26/9/2009  18:00  28.33
26/9/2009  19:00  25.56
26/9/2009  20:00  24.44
26/9/2009  21:00  24.44
26/9/2009  22:00  23.89
26/9/2009  23:00  24.44
27/9/2009  0:00  23.89
27/9/2009  1:00  23.89
27/9/2009  2:00  22.78
27/9/2009  3:00  22.78
27/9/2009  4:00  22.22
27/9/2009  5:00  21.67
27/9/2009  6:00  21.67
27/9/2009  7:00  22.22
27/9/2009  8:00  23.89
27/9/2009  9:00  26.11
27/9/2009  10:00  27.22
27/9/2009  11:00  27.22
27/9/2009  12:00  27.22
27/9/2009  13:00  28.89
27/9/2009  14:00  30
27/9/2009  15:00  29.44
27/9/2009  16:00  30.56
27/9/2009  18:00  28.33
27/9/2009  19:00  23.33
27/9/2009  20:00  21.67
27/9/2009  21:00  21.11
27/9/2009  22:00  20.56
27/9/2009  23:00  20
28/9/2009  0:00  21.72
28/9/2009  1:00  21.72
28/9/2009  2:00  21.64
28/9/2009  3:00  21.58
28/9/2009  4:00  21.55
28/9/2009  5:00  21.87
28/9/2009  6:00  21.75
28/9/2009  7:00  22.69
28/9/2009  8:00  21.72
28/9/2009  9:00  22.80
28/9/2009  10:00  23.65
28/9/2009  11:00  25.23
28/9/2009  12:00  28.54
28/9/2009  13:00  29.20
28/9/2009  14:00  28.54
28/9/2009  15:00  29.20
28/9/2009  16:00  28.78
28/9/2009  17:00  29.52
28/9/2009  18:00  24.91
28/9/2009  19:00  23.89
28/9/2009  20:00  23.95
28/9/2009  21:00  23.53
28/9/2009  22:00  21.72
28/9/2009  23:00  21.72
29/9/2009  0:00  23.12
29/9/2009  1:00  23.14
29/9/2009  2:00  22.58
29/9/2009  3:00  22.53
29/9/2009  4:00  21.72
29/9/2009  5:00  22.43
29/9/2009  6:00  22.67
29/9/2009  7:00  22.50
29/9/2009  8:00  24.04
29/9/2009  9:00  24.91
29/9/2009  11:00  29.59
29/9/2009  12:00  27.79
29/9/2009  13:00  29.76
29/9/2009  14:00  24.12
29/9/2009  15:00  24.12
29/9/2009  16:00  23.53
29/9/2009  17:00  23.53
29/9/2009  18:00  23.53
29/9/2009  19:00  23.53
29/9/2009  20:00  21.72
29/9/2009  21:00  21.72
29/9/2009  22:00  22.67
29/9/2009  23:00  22.98
30/9/2009  0:00  21.72
30/9/2009  1:00  21.72
30/9/2009  2:00  22.53
30/9/2009  3:00  22.53
30/9/2009  4:00  22.53
30/9/2009  5:00  22.53
30/9/2009  6:00  22.53
30/9/2009  7:00  21.72
30/9/2009  8:00  24.01
30/9/2009  9:00  23.67
30/9/2009  10:00  25.87
30/9/2009  11:00  29.74
30/9/2009  12:00  29.57
30/9/2009  13:00  29.90
30/9/2009  14:00  29.53
30/9/2009  15:00  30.11
30/9/2009  16:00  29.53
30/9/2009  17:00  29.29
30/9/2009  18:00  29.17
30/9/2009  19:00  23.89
30/9/2009  20:00  23.95
30/9/2009  21:00  21.72
30/9/2009  22:00  22.53
30/9/2009  23:00  22.53
1/10/2009  0:00  21.72
1/10/2009  1:00  22.53
1/10/2009  2:00  21.67
1/10/2009  4:00  21.72
1/10/2009  5:00  21.72
1/10/2009  6:00  22.16
1/10/2009  7:00  23.78
1/10/2009  8:00  24.04
1/10/2009  9:00  23.60
1/10/2009  10:00  23.60
1/10/2009  11:00  23.60
1/10/2009  12:00  28.54
1/10/2009  13:00  29.53
1/10/2009  14:00  29.20
1/10/2009  15:00  26.79
1/10/2009  16:00  29.53
1/10/2009  17:00  29.20
1/10/2009  18:00  29.20
1/10/2009  19:00  22.54
1/10/2009  20:00  23.67
1/10/2009  21:00  24.08
1/10/2009  22:00  24.24
1/10/2009  23:00  23.53
2/10/2009  0:00  23.53
2/10/2009  1:00  23.53
2/10/2009  2:00  21.72
2/10/2009  3:00  21.72
2/10/2009  4:00  21.72
2/10/2009  5:00  23.53
2/10/2009  6:00  23.53
2/10/2009  7:00  23.53
2/10/2009  8:00  23.89
2/10/2009  9:00  24.91
2/10/2009  10:00  28.54
2/10/2009  11:00  29.20
2/10/2009  12:00  29.53
2/10/2009  13:00  29.53
2/10/2009  14:00  29.79
2/10/2009  15:00  28.78
2/10/2009  16:00  29.35
2/10/2009  17:00  28.90
2/10/2009  18:00  25.23
2/10/2009  19:00  23.45
2/10/2009  21:00  23.33
2/10/2009  22:00  22.14
2/10/2009  23:00  22.53
3/10/2009  0:00  22.58
3/10/2009  1:00  22.58
3/10/2009  2:00  21.65
3/10/2009  3:00  23.45
3/10/2009  4:00  22.43
3/10/2009  5:00  22.41
3/10/2009  6:00  22.45
3/10/2009  7:00  21.86
3/10/2009  8:00  21.72
3/10/2009  9:00  27.60
3/10/2009  10:00  28.54
3/10/2009  11:00  29.27
3/10/2009  12:00  29.90
3/10/2009  13:00  29.90
3/10/2009  14:00  30.11
3/10/2009  15:00  32.58
3/10/2009  16:00  30.11
3/10/2009  17:00  29.90
3/10/2009  18:00  29.20
3/10/2009  19:00  24.35
3/10/2009  20:00  27.84
3/10/2009  21:00  27.90
3/10/2009  22:00  24.21
3/10/2009  23:00  24.04
4/10/2009  0:00  23.95
4/10/2009  1:00  21.72
4/10/2009  2:00  22.53
4/10/2009  3:00  22.53
4/10/2009  4:00  22.53
4/10/2009  5:00  22.58
4/10/2009  6:00  22.53
4/10/2009  7:00  22.55
4/10/2009  8:00  23.56
4/10/2009  9:00  23.60
4/10/2009  10:00  28.54
4/10/2009  11:00  28.90
4/10/2009  12:00  29.27
4/10/2009  14:00  30.11
4/10/2009  15:00  30.11
4/10/2009  16:00  30.11
4/10/2009  17:00  29.90
4/10/2009  18:00  27.60
4/10/2009  19:00  24.04
4/10/2009  20:00  23.53
4/10/2009  21:00  21.72
4/10/2009  22:00  23.53
4/10/2009  23:00  21.72
5/10/2009  0:00  22.53
5/10/2009  1:00  22.58
5/10/2009  2:00  22.58
5/10/2009  3:00  22.58
5/10/2009  4:00  22.58
5/10/2009  5:00  22.58
5/10/2009  6:00  22.58
5/10/2009  7:00  22.53
5/10/2009  8:00  24.04
5/10/2009  9:00  27.96
5/10/2009  10:00  29.27
5/10/2009  11:00  29.53
5/10/2009  12:00  29.90
5/10/2009  13:00  32.58
5/10/2009  14:00  30.11
5/10/2009  15:00  29.53
5/10/2009  16:00  28.54
5/10/2009  17:00  27.96
5/10/2009  18:00  24.91
5/10/2009  19:00  23.40
5/10/2009  20:00  24.04
5/10/2009  21:00  23.53
5/10/2009  22:00  21.72
5/10/2009  23:00  21.72
6/10/2009  0:00  22.53
6/10/2009  1:00  22.58
6/10/2009  2:00  22.58
6/10/2009  3:00  22.58
6/10/2009  4:00  22.58
6/10/2009  5:00  21.65
6/10/2009  7:00  22.78
6/10/2009  8:00  23.64
6/10/2009  9:00  27.96
6/10/2009  10:00  29.53
6/10/2009  11:00  30.12
6/10/2009  12:00  30.11
6/10/2009  13:00  32.73
6/10/2009  14:00  30.11
6/10/2009  15:00  30.11
6/10/2009  16:00  25.32
6/10/2009  17:00  29.04
6/10/2009  18:00  24.56
6/10/2009  19:00  23.09
6/10/2009  20:00  23.95
6/10/2009  21:00  23.95
6/10/2009  22:00  23.53
6/10/2009  23:00  23.95
7/10/2009  0:00  23.53
7/10/2009  1:00  23.53
7/10/2009  2:00  22.53
7/10/2009  3:00  22.53
7/10/2009  4:00  22.58
7/10/2009  5:00  21.65
7/10/2009  6:00  21.65
7/10/2009  7:00  22.67
7/10/2009  8:00  23.68
7/10/2009  9:00  24.52
7/10/2009  10:00  28.21
7/10/2009  11:00  27.98
7/10/2009  12:00  27.88
7/10/2009  13:00  27.97
7/10/2009  14:00  28.90
7/10/2009  15:00  28.54
7/10/2009  16:00  29.20
7/10/2009  17:00  29.08
7/10/2009  18:00  25.45
7/10/2009  19:00  22.46
7/10/2009  20:00  21.65
7/10/2009  21:00  21.86
7/10/2009  22:00  20.56

Appendix 2. MATLAB Source Code

% Neural Network Program to predict the average air temperature from
% 28 September to 7 October
% p and t are the input and target matrices built from the hourly air
% temperature data in Appendix 1

% Create Network
numHiddenNeurons = 30;  % The number of hidden neurons used in the network
net = newff(p,t,numHiddenNeurons);

% Division of Samples
% Allocation of the sample data for Training, Validation and Testing
net.divideParam.trainRatio = 0.60;  % Adjust as desired
net.divideParam.valRatio = 0.20;   % Adjust as desired
net.divideParam.testRatio = 0.20;  % Adjust as desired

% Random Seed for Reproducing NFTool Results
rand('seed',26389783.000000)
net = init(net);

% Train Network
[net,tr] = train(net,p,t);

% Simulate Network
[trainOutput,Pf,Af,E,trainPerf] = sim(net,p(:,tr.trainInd),[],[],t(:,tr.trainInd));
[valOutput,Pf,Af,E,valPerf] = sim(net,p(:,tr.valInd),[],[],t(:,tr.valInd));
[testOutput,Pf,Af,E,testPerf] = sim(net,p(:,tr.testInd),[],[],t(:,tr.testInd));

% Display Performance
fprintf('Train vector MSE: %f\n',trainPerf);
fprintf('Validation vector MSE: %f\n',valPerf);
fprintf('Test vector MSE: %f\n',testPerf);

% Plot Regression
figure
postreg({trainOutput,valOutput,testOutput}, ...
        {t(:,tr.trainInd),t(:,tr.valInd),t(:,tr.testInd)});
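For readers without the MATLAB Neural Network Toolbox, the same workflow (a single hidden layer of 30 tanh neurons, a 60/20/20 train/validation/test split, error back-propagation, and per-split MSE reporting) can be sketched in plain Python/NumPy. This is an illustrative stand-in, not part of the project: the synthetic sine-shaped "temperature" series, the learning rate, the epoch count, and the helper names `scale`, `forward` and `mse` are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(26389783)  # fixed seed, echoing the MATLAB script

# Synthetic daily cycle as a stand-in for the Wilma hourly series:
# predict the next hour's temperature from the current one
hours = np.arange(240)
temps = 26 + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.3, hours.size)
p = temps[:-1].reshape(-1, 1)  # inputs
t = temps[1:].reshape(-1, 1)   # targets

def scale(x):
    # Normalise to [-1, 1], as newff does internally
    return 2 * (x - x.min()) / (x.max() - x.min()) - 1

p_s, t_s = scale(p), scale(t)

# 60/20/20 split of the samples
idx = rng.permutation(len(p_s))
n_tr, n_va = int(0.6 * len(idx)), int(0.2 * len(idx))
tr_i, va_i, te_i = idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

# 1 input -> 30 hidden (tanh) -> 1 linear output
hidden = 30
W1 = rng.normal(0, 0.5, (1, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.5, (hidden, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def mse(split):
    _, y = forward(p_s[split])
    return float(np.mean((y - t_s[split]) ** 2))

mse_before = mse(te_i)  # test error of the untrained network

lr = 0.05
for _ in range(500):  # plain full-batch gradient descent (EBP)
    h, y = forward(p_s[tr_i])
    err = y - t_s[tr_i]
    # back-propagate through the output and hidden layers
    dW2 = h.T @ err / len(tr_i); db2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    dW1 = p_s[tr_i].T @ dh / len(tr_i); db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(f'Train vector MSE: {mse(tr_i):f}')
print(f'Validation vector MSE: {mse(va_i):f}')
print(f'Test vector MSE: {mse(te_i):f}')
```

Because the split is random and the seed is fixed, rerunning the sketch reproduces the same three MSE values, which mirrors the role of `rand('seed',...)` in the MATLAB listing.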