NEURAL NETWORKS NON-LINEAR SCALING
Alok Bhaskar Nakate
B.E., Pune University, India, 2006
PROJECT
Submitted in partial satisfaction of
the requirements for the degree of
MASTER OF SCIENCE
in
COMPUTER SCIENCE
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
SPRING
2011
NEURAL NETWORKS NON-LINEAR SCALING
A Project
by
Alok Bhaskar Nakate
Approved by:
__________________________________, Committee Chair
V. Scott Gordon, Ph.D.
__________________________________, Second Reader
Kwai Ting Lan, Ph.D.
____________________________
Date
Student: Alok Bhaskar Nakate
I certify that this student has met the requirements for format contained in the University format
manual, and that this project is suitable for shelving in the Library and credit is to be awarded for
the Project.
__________________________, Graduate Coordinator
Nikrouz Faroughi, Ph.D.
Department of Computer Science
________________
Date
Abstract
of
NEURAL NETWORKS NON-LINEAR SCALING
by
Alok Bhaskar Nakate
Training a neural network with the backpropagation algorithm is a systematic process
for modeling a given set of data. Among other things, this training process involves scaling
the input and output datasets provided to the neural network. Scaling is required because
real-world datasets are usually not in the range [0, 1], whereas neural networks work only
with data in the range [0, 1], i.e. the neurons either fire or they do not. Linear scaling is
typically used for this purpose, which for certain datasets can make it difficult for the
neural network to properly differentiate between values that are close together.
This project shows how non-linear median scaling can be applied to the training
datasets and compares its performance against the linear scaling methodology for a
variety of training datasets, both in terms of speed of learning and subsequent ability to
generalize. This project demonstrates that introducing non-linear scaling into
backpropagation and applying it to the various datasets improves performance to
some extent.
_______________________, Committee Chair
V. Scott Gordon, Ph.D.
_______________________
Date
ACKNOWLEDGEMENTS
I would like to thank Professor Scott Gordon, who helped me choose the topic for this
project and spent countless hours guiding me through it. I would also like to thank him
for his valuable advice and for reviewing the project report and providing suggestions.
I would also like to thank Professor Kwai Ting Lan for reviewing this project report.
Finally, I would like to thank my numerous friends who endured this long process with
me, always offering support.
TABLE OF CONTENTS
Acknowledgements ............................................................................................................ vi
List of Tables ..................................................................................................................... ix
List of Figures ..................................................................................................................... x
Chapter
1. INTRODUCTION ......................................................................................................... 1
2. BACKGROUND ........................................................................................................... 3
   2.1. Neural Networks ................................................................................................... 3
   2.2. Artificial neuron .................................................................................................... 7
   2.3. Backpropagation algorithm ................................................................................. 11
      2.3.1. Criteria ......................................................................................................... 12
      2.3.2. Learning rate ................................................................................................ 13
      2.3.3. Generalization .............................................................................................. 13
   2.4. Linear scaling ...................................................................................................... 14
      2.4.1. Drawback of the linear scaling ..................................................................... 16
3. PROPOSED SOLUTION ............................................................................................ 19
   3.1. Scaling in backpropagation ................................................................................. 20
      3.1.1. Initialization ................................................................................................. 21
      3.1.2. Scaling ......................................................................................................... 21
      3.1.3. Training ....................................................................................................... 23
      3.1.4. Un-scaling process ....................................................................................... 24
   3.2. Which Neural Network elements are scaled? ...................................................... 26
   3.3. Scaling range ...................................................................................................... 27
4. EXPERIMENTAL METHODOLOGY ........................................................................ 28
   4.1. Datasets .............................................................................................................. 28
      4.1.1. Minimum weight steel beam problem .......................................................... 28
      4.1.2. Two-Dimensional projectile motion datasets ................................................ 30
      4.1.3. Wine recognition data .................................................................................. 32
   4.2. Measuring performance ...................................................................................... 35
5. RESULTS ................................................................................................................... 37
   5.1. Result for minimum weight problem datasets ..................................................... 37
   5.2. Result for two-dimensional projectile motion dataset ......................................... 38
   5.3. Result for wine recognition data ......................................................................... 39
6. CONCLUSION ........................................................................................................... 41
7. FUTURE WORK ........................................................................................................ 42
Appendix A Source Code ................................................................................................ 43
Appendix B Datasets ....................................................................................................... 61
Appendix C Results ......................................................................................................... 66
Bibliography .................................................................................................................... 82
LIST OF TABLES
Page
Table 1 Linear scaling example ……………….….……….………...….…………….. 16
Table 2 Linear scaling with large difference in datasets ……………….…………...... 17
Table 3 Training datasets for steel beam design problem …………………………….. 29
Table 4 Testing datasets for steel beam design problem ……..………………………. 30
Table 5 Example training datasets for 2-D projectile data motion …...………………. 31
Table 6 Example testing datasets for 2-D projectile data motion ….…………………. 32
Table 7 Example training datasets for wine recognition ………….……….…………. 34
Table 8 Example testing datasets for wine recognition ……………....………………. 34
Table 9 Result for minimum weight problem …………………………………..…….. 38
Table 10 Result for 2-D projectile motion problem ………………...………………… 39
Table 11 Result for wine recognition data …………………..………………………... 40
LIST OF FIGURES
Page
Figure 1 Artificial neural network…………………………..……………………….…. 5
Figure 2 Artificial neuron with activation function F……………..……………………. 7
Figure 3 Sigmoidal activation function…..……………………….…………………….. 9
Figure 4 Backpropagation algorithm……..……………………………...……………. 12
Figure 5 Neural network with large difference in input datasets…...…………………. 18
Figure 6 Formula for non-linear scaling…....…………………….…….…..…………. 20
Figure 7 Formula for linear scaling…….…..…………………………………………. 22
Figure 8 Formula for non-linear scaling in detail…....…………….………….………. 23
Figure 9 Formula for scaling criteria value..…………………………………..………. 24
Figure 10 Formula for non-linear unscaling.…………..………………………………. 25
Figure 11 Formula for linear unscaling…….………………………………….………. 26
Chapter 1
INTRODUCTION
Neural networks are biologically inspired and often capable of modeling complex
real-world functions. Artificial neural networks are composed of elements that behave
in a manner resembling biological neurons, and are organized in a way that is inspired
by the anatomy of the brain [1]. An interesting characteristic of neural networks is that
they can learn their behavior by example, i.e. when a certain set of inputs is applied,
they self-adjust to produce responses consistent with the desired outputs. When a set of
training data is applied to a neural network, it learns that particular problem; afterwards,
when the trained network encounters unknown inputs from similar circumstances, it can
often respond appropriately. This characteristic of neural networks is known as
generalization [4]. To do so effectively, a network must first be trained with training
datasets that include the inputs and the corresponding desired outputs. The neural
network then tries to minimize the error between the desired output specified in the
training data and the actual output produced by the network. This is often a slow
process, so researchers are interested in finding faster training methods.
One of the well-known methods of training a neural network is Backpropagation.
Backpropagation is a systematic method for training multilayer neural networks. When a
set of inputs is applied to the neural network, the backpropagation algorithm adjusts the
weights based on the resulting error.
Backpropagation usually performs scaling of input (optional) and output values.
The reason for scaling the training datasets is that the artificial neurons only output in the
range [0, 1]. Real world problems often include data outside this range. Backpropagation
typically uses linear scaling before the algorithm starts its training.
However, this linear scaling approach often fails when the input or output values
are clustered and not adequately distributed. If all the datasets are scaled linearly, this can
result in training data that is clustered too closely together for the network to model.
Therefore, this project will introduce a new approach of nonlinear scaling, and test this
method on a variety of problems with different training datasets.
Chapter 2
BACKGROUND
2.1. Neural Networks
Artificial neural networks are inspired by the biological neurons that reside in the
brain's central nervous system. A neural network has the capability of solving real-world
problems by building a computational model and processing the information provided to
it. It can be called an adaptive system that changes its structure based on the information
that flows through the network during learning. Learning implies that neural networks
are capable of changing their input/output behavior as a result of changes in the
environment.
Neural networks learn by example. Training sets are provided to the neural
network, and by the use of training algorithms such as backpropagation, it learns to
replicate them. Neural networks operate in three scenarios, which are as follows:
1. During the training process of the neural network, datasets including both the
inputs and the desired outputs are provided. The network then adjusts the weights with
the help of the algorithm (backpropagation) so that it reaches the desired output. This
process may require many iterations to reach the desired output. This is also known as
the correct/known behavior. The neural network therefore learns from this process when
provided with an appropriate training dataset producing the correct output.
2. After training, test datasets are applied to the neural network to test the above process
in order to determine whether the designed network was successful in learning. The test
datasets also include inputs and the desired outputs. The only difference between the
first step and this step is that instead of iterating through the loop to adjust the
weights, the network uses the already-adjusted weights to try to reach the desired output
for similar untrained cases.
3. In the third scenario, only input datasets are provided to the designed neural network
expecting that the neural network will produce the desired output since it has learnt
from the above steps. This scenario can be a real-life example, for instance
implementing the trained neural network in a field where the desired outputs are
unknown and the only data available is input data.
The main advantage of neural networks lies in their ability to represent both linear
and non-linear relationships and in their ability to learn these relationships directly from
the data being modeled.
The basic artificial neural network is shown in Figure 1 below. It consists of four
main components:
1. The input layer: All the artificial neurons accept the inputs given to the neural
networks.
2. The Output layer: All the results are produced at this layer.
3. One or more hidden layers: The computations occur here, described in section 2.2.
4. The weights: The values at each link between the nodes. Weights get adjusted in the
network according to the error calculations.
Figure 1: Artificial neural network
In the traditional computational (non-neural-network) approach, complex
problems are solved by "divide and conquer": a complex problem is decomposed into
simpler or smaller elements so that it is easier to understand, and these partially solved
elements are then combined to produce the complete system. In neural networks,
however, the solution is distributed among the neurons. Each neuron computes an output
and passes it on to another neuron or to the output layer. Neural networks can often
modify their behavior in response to changes in the environment, which is similar to the
working of the brain.
Some advantages of neural networks are:
1. A trained neural network can become an “expert” in the category of information it
has learned.
2. Adaptive learning: They have an ability to learn how to do tasks based on the data
given for training or initial experience.
3. Self-Organization: They can create their own representation of information that
they receive during the learning process.
4. Flexibility to a large number of domains.
2.2. Artificial neuron
The artificial neuron is an abstraction of the first-order characteristics of the
biological neuron, and represents a mathematical unit in the neural network model. We
can say that neurons form the computational model inspired by the natural biological
neuron. Figure 2 below shows the neuron used as the fundamental building block for the
backpropagation algorithm. It accepts inputs from the datasets at the input layer or from
other neurons, multiplies each input by its corresponding weight, and adds the products,
producing a weighted sum. This weighted sum of products, also called NET, must be
calculated for each neuron in the network. After NET is calculated, it is passed on to the
activation function to determine to what degree the neuron fires, thereby producing the
signal OUT [1].
Figure 2: Artificial neuron with activation function F (inputs X1, X2, X3 with weights
W1, W2, W3; NET = sum of X*W is passed, with a threshold, through F to produce OUT)
As shown in figure 2 above, X1, X2 and X3 are the inputs given to the neuron.
W1, W2 and W3 are the weights initially applied to the inputs, which are later adjusted
accordingly. The summation block accepts these inputs and weights, calculates the
weighted sum NET, and passes it on to the activation function.
Each input is multiplied by a corresponding weight and all the weighted inputs are
then summed to determine the activation level of that neuron. In figure 2 above, the "F"
block is the activation function. This activation function F processes the NET signal to
produce the neuron's output signal, OUT.
OUT = F (NET)
In some neural network models, F may be a simple threshold function:
OUT = 1 if NET > T
OUT = 0 otherwise
where T is a constant threshold value. Alternatively, F may be a function that more
accurately simulates the non-linear characteristic of the biological neuron and permits
more general network functions [1].
However, it is more common to use a continuous function instead of a hard
threshold. The F processing block compresses the range of NET so that OUT never
exceeds certain low limits regardless of the value of NET, and this F processing block is
therefore sometimes known as the squashing function. The squashing function is often
chosen to be the logistic function or sigmoid (S-shaped), represented mathematically as:

    OUT = F(NET) = 1 / (1 + e^(-NET))
As shown in the figure 3, the squashing function or the sigmoid compresses the
range of NET so that the OUT lies between zero and one. This is why the calculated
output of a neural network can lie only between zero and one. The advantage of using the
logistic function rather than a hard threshold is that the logistic function is differentiable
which led to the derivation of the backpropagation algorithm [1].
Figure 3: Sigmoidal activation function
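As a hedged illustration of the NET and OUT computations just described, the following
C++ sketch computes a single neuron's output (the function and variable names are
illustrative; the Appendix A code performs the same computation inline):

#include <cmath>
#include <cstddef>

double neuronOutput(const double inputs[], const double weights[], std::size_t n)
{
    double net = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        net += inputs[i] * weights[i];       // NET: weighted sum of the inputs
    return 1.0 / (1.0 + std::exp(-net));     // OUT = F(NET), the sigmoid, always in (0, 1)
}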
In biological terms, the inputs of an artificial neuron resemble the synapses of a
biological neuron. Each input is multiplied by a weight that resembles the strength of the
corresponding signal in the natural neuron, and the result is processed by the
mathematical activation function.
The weights of the artificial neuron can be adjusted to obtain the desired output
for the specific inputs provided, so depending on the adjustment of the weights, the
summation in the computational block will differ.
Normally, the artificial neural network is not composed of a single neuron. Typical
neural networks can have as few as half a dozen or as many as hundreds or thousands of
neurons involved in complicated applications. There are various algorithms which adjust
the weights according to the desired output, and this process is called training an artificial
neural network. The backpropagation algorithm is one of the most popular training methods.
2.3. Backpropagation algorithm
Backpropagation is a systematic method for training multilayer artificial neural
networks. It is a supervised learning method: the result calculated by the network is
compared against the result expected in the training datasets, and the weights are
adjusted accordingly. The difference between the produced output and the desired output
is known as the error. The algorithm then uses this information to modify the weights and
train the network to reach the expected result as quickly as possible. So, we can say that
when a set of datasets is provided to a neural network, it systematically trains itself so as
to respond intelligently to the inputs.
The backpropagation algorithm is mainly applied to feed-forward neural networks.
As the name suggests, the network sends its calculations forward and the errors are
propagated backwards.
The backpropagation algorithm follows an iterative process, i.e. in each iteration
the weights of the nodes are modified using data from the training datasets. The main
objective of the backpropagation algorithm is to reduce the error by adjusting the weights
throughout the entire network. The error is the difference between the actual result
calculated by the neural network and the desired output. Often, the data provided to the
neural network are scattered, so before backpropagation executes the learning process,
these scattered data are scaled to some desired range.
The objective of training the neural network using backpropagation is to adjust
the weights so that the application of a set of inputs produces the desired set of outputs.
The input-output sets are often referred to as vectors. The training process assumes that
each input vector is paired with a target vector, or desired output; together these are
called a training pair [1]. The neural network is trained over a number of training
pairs.
The backpropagation process is shown in figure 4.
Initialize the weights in the network to small random numbers
Do
    For each training pair (di, do) in the training sets
        // di is the desired input and do is the desired output
        // this is the feed-forward pass
        O = Neural_network_outputs(network, di)
        Calculate error (do - O) at the output layer for each output in the training pair
        Compute delta for all weights from the hidden layer to the output layer
        Compute delta for all weights from the input layer to the hidden layer
        // the above two steps are the backward pass
        Update the weights in the network
Until all datasets are classified correctly and the desired criteria is satisfied
Return the set of trained network weights

Figure 4: Backpropagation algorithm
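As a hedged sketch of the weight update performed inside the loop above for a single
output-layer weight, assuming the sigmoid activation from section 2.2 (the names are
illustrative and momentum is omitted; the Appendix A code updates entire layers at once):

double updatedWeight(double weight, double learningRate,
                     double hiddenOut, double out, double desired)
{
    double errorTerm = (desired - out) * out * (1.0 - out);  // error times sigmoid derivative
    double delta = learningRate * hiddenOut * errorTerm;     // change for this connection
    return weight + delta;                                    // adjusted weight
}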
2.3.1. Criteria
This parameter indicates when the learning process should stop. All the outputs
must be within this criteria parameter to terminate.
13
2.3.2. Learning rate
This is a constant that affects the speed of learning. The mathematical
calculations of backpropagation are based on small changes being made to the weights at
each step of the error calculation. If the changes made to the weights are too large, the
algorithm may oscillate around the error minimum; in that case it is necessary to reduce
the learning rate. On the other hand, the smaller the learning rate, the more steps it takes
to reach the criteria.
2.3.3. Generalization
If you have a good training set with examples that cover most or all of the various
possible inputs, and the neural network learns them all, then it is likely to generalize and
successfully model other similar instances. This means that it will give the correct output
for other inputs of the same application.
2.4. Linear scaling
The backpropagation algorithm is a supervised learning method wherein a set of
input datasets is applied to the network to train it, after which test datasets are applied to
see how well it learned. If the network learns them all, then when unknown input data
are applied to the network it will hopefully generalize.
Often, the input and output datasets are not uniform, i.e. they do not have a fixed
range, and they are rarely limited to the range [0, 1]. These datasets are usually
distributed across a large range of values. Because of this large distribution of input
datasets, scaling becomes essential: the input and output values provided to the neural
network can be arbitrary numbers, whereas the neurons can output data only in the range
[0, 1], i.e. they fire or they do not fire. Hence scaling the datasets is necessary.
Linear scaling is therefore applied to the output values (training datasets), and
optionally to the input datasets, to bring all the datasets into the desired range [0, 1]
before backpropagation starts the learning process.
The formula for applying linear scaling to each input and output training
dataset value provided to the neural network is:

    Xi = (X - Xmin) / (Xmax - Xmin)

where
X = the value being scaled,
Xi = the scaled value of X,
Xmin = the minimum value of the particular input being scaled,
Xmax = the maximum value of the particular input being scaled.
Table 1 shows an example dataset and the scaled result. The first column is the
original training dataset and the second column is the scaled value for each left hand side
value.
The scaled values will lie in the range [0.05, 0.95] instead of the theoretical
range [0, 1]. As the sigmoidal function shown in figure 3 only approaches 0 or 1 but
never actually reaches them, the linear scaling formula is modified to scale all the
datasets from 0.05 to 0.95. As shown in table 1, the smallest value in the first column is
11.02, so its scaled value is 0.050, and the largest value is 14.83, so its scaled value is
0.95.
Table 1: Linear scaling example

Datasets provided to Neural Network    Linear Scaled values
14.23                                  0.807
13.19                                  0.563
14.36                                  0.841
13.24                                  0.573
14.19                                  0.800
14.39                                  0.845
14.06                                  0.765
14.83                                  0.95
11.02                                  0.050
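As an illustration, the following is a minimal C++ sketch of the modified linear scaling
described above, mapping [Xmin, Xmax] into [0.05, 0.95] (the function name is
illustrative and is not taken from the project code):

double linearScale(double X, double Xmin, double Xmax)
{
    if (Xmax == Xmin) Xmax = Xmin + 1.0;        // guard against divide-by-zero
    double unit = (X - Xmin) / (Xmax - Xmin);   // plain [0, 1] scaling
    return 0.05 + 0.9 * unit;                   // shift into [0.05, 0.95]
}

With Xmin = 11.02 and Xmax = 14.83 this maps 11.02 to 0.05, 14.83 to 0.95, and the
intermediate values approximately as shown in table 1.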
2.4.1. Drawback of the linear scaling
For some training datasets that contain large differences, scaling values that are
close together will bring their differences near to zero. The motivation for using
backpropagation to train neural networks is to reduce the error between the output
calculated by the network and the desired output in the datasets. However, if the
differences between some of the input/output values provided to the neural network
become too small for the network to distinguish, the subsequent weight manipulations in
backpropagation become harder and hence require many more iterations to achieve the
desired output.
To demonstrate this, let us consider the datasets in table 2. After applying the
linear scaling formula to each of the values, we get the following scaled values.
Table 2: Linear scaling with large difference in datasets

Datasets provided to Neural Network    Linear Scaled values
0.25                                   0.00016
0.15                                   0.050
0.75                                   0.0011
0.99                                   0.0014
600                                    0.95
430                                    0.7165
598                                    0.9966
According to the linear scaling formula, with Xmin = 0.15 and Xmax = 600: for
X = 0.15, Xi = 0.05; for X = 0.75, Xi = 0.0011; and for X = 600, Xi = 0.95.
Figure 5: Neural networks with large difference in input datasets
In figure 5, the width of the arrows represents the magnitude of the weights applied
to each input. The arrows for the inputs with value 0.15 are thicker and the arrows for
the inputs with value 600 are thinner. In the above example, the difference between the
values 0.15 and 600 is large, so the smaller values such as 0.15 and 0.75 are all scaled
close to zero, and this adversely affects the ability of backpropagation to find weights
that differentiate between them.
This project explores whether this drawback can be reduced by applying non-linear
scaling to the input/output datasets, i.e. scaling the larger values into a smaller range
and the smaller values into a larger range.
Chapter 3
PROPOSED SOLUTION
One way of making the scaling process non-linear is to scale the datasets
according to the median of each dataset's inputs and outputs. This approach might help
resolve the scaling issue for datasets with large differences, because it scales densely
packed values by a smaller amount and sparsely distributed values by a larger amount.
The formula for median-type scaling is shown in figure 6.
First, we find the median, minimum and maximum for each dataset, i.e. these three
values for each column in the training datasets provided to the neural network. Then, for
each dataset value, we apply the given formula and linearly scale the left and right halves
of the data separately. After the right half of the formula computes the scaled value for
data that lie to the right of the median, 0.5 is added to the computed value to shift it into
the right-hand half of the range, as shown in figure 6. Looked at closely, each sub-portion
(the values that lie on the left-hand side of the median and the values that lie on the
right-hand side) is actually scaled linearly; however, the overall mapping of the datasets
is non-linear about the median value, and the algorithm scales them accordingly.
Figure 6: Formula for non-linear scaling
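The exact expression is given in figure 6. As a hedged illustration, the following C++
sketch assumes the piecewise-linear form described above and in section 3.3, mapping
values below the median into [0.05, 0.5] and values above it into [0.5, 0.95] (the function
and variable names are illustrative, not taken from the project code):

double medianScale(double X, double Xmin, double Xmed, double Xmax)
{
    if (Xmed == Xmin) Xmin = Xmed - 1.0;   // guard against divide-by-zero
    if (Xmax == Xmed) Xmax = Xmed + 1.0;
    if (X <= Xmed)                         // left half, scaled linearly into [0.05, 0.5]
        return 0.05 + 0.45 * (X - Xmin) / (Xmed - Xmin);
    return 0.50 + 0.45 * (X - Xmed) / (Xmax - Xmed);   // right half, shifted past 0.5
}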
3.1. Scaling in backpropagation
The backpropagation algorithm, with scaling, is organized into four major steps. They
are as follows:
1. Initialization: In this procedure, the neural network is defined and all the necessary
parameters are set. The weights are initially set to small, random values.
2. Scaling: This process applies the scaling formula shown in figure 6 to the datasets
provided to the neural network to bring all the datasets into the desired range [0, 1].
3. Training: This process includes the forward pass and the error-correction procedure for
reaching the desired output. It iterates until all the datasets are classified correctly and
the desired criterion is reached. (Described in figure 4.)
4. Un-scaling: This process scales the outputs calculated by the backpropagation
algorithm back to the original range, reversing the scaling applied in step 2.
3.1.1. Initialization
1. Set training datasets and test datasets, i.e. initialize the number of rows and number of
columns in the datasets.
2. Design the neural network by defining the network topology, i.e. the number of nodes
with input, output and hidden layers.
3. Initialize the backpropagation parameters like criteria, learning rate.
4. Initialize the weights to some small random values.
3.1.2. Scaling
1. Initialize the extreme[] array. The extreme[] array holds the 3 values – minimum,
maximum, and the median values for each input and output datasets.
2. Open the Training case file and read all the data to an array Train[][]
3. For each column in array Train[][]
a. Find out the minimum and maximum values and insert them in extreme[0] and
extreme[1] array.
4. For each input datasets
a. Scale down only the input datasets except the output datasets in the array Train[][]
with the following linear scaling formula.
i.e. for each dataset in the particular column apply the formula shown in figure 7
and this will scale them in the range [0 – 1].
Figure 7: Formula for linear scaling
5. Store the output dataset columns in a temporary array MedianArray[]
6. Sort the MedianArray[] to find the median.
7. For each output datasets
a. Calculate the median value and insert it in array extreme[2].
8. For each output datasets
a. Scale down the output datasets with the three values in extreme[] array with the
non-linear median type scaling algorithm shown in figure 8 and insert each scaled
value in Train[][] array.
Figure 8: Formula for non-linear scaling in detail
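Steps 5 through 7 above find the median of each output column by sorting a copy of it and
taking the middle element. A minimal sketch of that computation, using std::sort rather
than the hand-written sort in Appendix A (the function name is illustrative):

#include <algorithm>
#include <vector>

double findColumnMedian(std::vector<double> column)   // copy, so the caller's order is untouched
{
    std::sort(column.begin(), column.end());
    std::size_t n = column.size();
    if (n % 2 == 0)
        return (column[n / 2 - 1] + column[n / 2]) / 2.0;   // mean of the two middle values
    return column[n / 2];                                    // single middle value
}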
3.1.3. Backpropagation algorithm
1. For each data training pair (di , do) in the training sets
a. Process the feed forward pass
b. For each output in the training pair, calculate the error at the output layer
c. Scale the criteria value for each corresponding calculated output value. The criteria
value is scaled to the range [0, 0.9], not [0.05, 0.95]:
Figure 9: Formula for scaling the criteria value
d. Process the backward pass
i. Compute delta for all weights from the hidden layer to the output layer.
ii. Compute delta for all the weights from the input layer to the hidden layer.
e. Update the weights in the network in a way that minimizes the error.
2. Repeat Step 1 for all datasets until they are classified correctly and within the desired
criteria.
3. Return the set of trained network weights.
3.1.4. Un-scaling process
1. After the weight-manipulation process executes, and the outputs lie within the desired
criteria or the network reaches the defined number of iterations, the dataset values need
to be scaled back to the original range. The formulas for reversing the scaling process
are shown in figures 10 and 11 (a short sketch of both inverse mappings follows this list).
i. For non-linear scaling back to the original range:
Figure 10: Formula for non-linear unscaling
ii. For Linear scaling back to original range:
Figure 11: Formula for linear unscaling
2. Print the calculated Output
3. Exit the program
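The two un-scaling formulas in figures 10 and 11 are the algebraic inverses of the scaling
formulas. A minimal sketch, assuming the same [0.05, 0.95] piecewise-linear form used in
the scaling sketches shown earlier (names are illustrative):

double linearUnscale(double Xi, double Xmin, double Xmax)
{
    // inverts: Xi = 0.05 + 0.9 * (X - Xmin) / (Xmax - Xmin)
    return Xmin + (Xi - 0.05) * (Xmax - Xmin) / 0.9;
}

double medianUnscale(double Xi, double Xmin, double Xmed, double Xmax)
{
    if (Xi <= 0.5)   // value was scaled from the left half [Xmin, Xmed]
        return Xmin + (Xi - 0.05) * (Xmed - Xmin) / 0.45;
    return Xmed + (Xi - 0.50) * (Xmax - Xmed) / 0.45;   // right half [Xmed, Xmax]
}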
3.2. Which Neural Network elements are scaled?
In the above algorithm, the non-linear median-type scaling is applied to the output
datasets only, while the input datasets are scaled according to the linear formula. These
results are then compared against a fully linear scaling process applied to the same
datasets, and the performance of the neural network is observed. Non-linear scaling is
applied only to the output datasets here, but scaling the inputs non-linearly would also be
a useful experiment.
A criterion is a parameter used by the backpropagation algorithm to determine the
termination point. Every calculated output error is compared with the criteria to see if
training has succeeded. In the algorithm, we need to scale the criteria value because all
the input and output datasets have been scaled; when comparisons are made, all the data
values need to be within the same range. Hence, the criteria value for each calculated
output is also scaled non-linearly.
3.3. Scaling range
All the formulas described above are theoretical formulas, so for implementation
purposes the algorithm needs to be refined to ensure that the error-correction calculation
is correct. For instance, in the scaling formula, if the difference between the Xmin and
Xmax values comes to zero, the algorithm will encounter a divide-by-zero error during
the actual implementation. Therefore, all the formulas need to be adjusted. As shown in
all the figures above for the backpropagation algorithm, all the datasets are scaled
(linearly or non-linearly) into the range [0.05, 0.95]. The reason for doing so is that the
logistic output used in backpropagation cannot actually output a zero or a one; it only
approaches these values (please refer to section 2.2 for the logistic function explanation).
In the median scaling formula, a value in the range [Median, Maximum] would
otherwise be computed into the range [Minimum, Median], so the formula has to be
refined so that a computed value in the range [Median, Maximum] is shifted to the right
of 0.5 (please refer to Appendix A for the formula implementations).
Chapter 4
EXPERIMENTAL METHODOLOGY
The effectiveness of non-linear scaling is assessed by applying different problem
sets to a particular neural network under the normal (linear) scaling technique, then
applying the same problem sets with the non-linear scaling technique, and measuring
how well the neural network does at learning the data and generalizing.
In this section, descriptions of each of the datasets are provided.
4.1. Datasets
Each problem has one training dataset and one testing dataset, which are applied
to the designed neural network. The tables below show only some of the training pairs,
as in some cases the full training dataset is too large to be included in this document.
The above-discussed backpropagation algorithm is applied to three problem datasets.
4.1.1. Minimum weight steel beam problem
In this dataset, the artificial neural network is used for learning in the domain
of structural engineering. The dataset consists of acceptable designs that satisfy the
requirements of the AISC LRFD design specification for steel structures. It is specifically
concerned with selecting a minimum-weight steel beam from the wide-flange (W) shape
database for a given loading condition [3]. The designed artificial neural network will be
used to learn to select the lightest W shape among all the available shapes. Each instance
of the training dataset consists of four input values:
• The member length (L)
• The unbraced length (Lb)
• The maximum bending moment in the member (Mmax)
• The maximum shear force (Vmax)
Each instance of the training dataset also includes the following corresponding output
value:
• The plastic modulus of the corresponding least-weight member (Zx)
Backpropagation uses the following parameter values:
Learning rate = 0.3
Criteria = 0.001
Table 3: Training datasets for steel beam design problem

                        Inputs                         Output
Instance      L       Lb      M max    V max      Zx
    1       0.40    0.40     0.190    0.190     0.6313
    2       0.20    0.20     0.120    0.240     0.3630
    3       0.35    0.35     0.035    0.035     0.1000
    4       0.15    0.15     0.045    0.120     0.1400
    5       0.15    0.15     0.035    0.095     0.1000
Table 4: Testing datasets for steel beam design problem

                        Inputs                         Output
Instance      L       Lb      M max    V max      Zx
    1       0.20    0.20     0.030    0.060     0.1000
    2       0.30    0.30     0.095    0.127     0.3900
    3       0.15    0.15     0.010    0.027     0.0415
    4       0.40    0.40     0.120    0.120     0.4830
4.1.2. Two-dimensional projectile motion datasets
This dataset covers the two-dimensional projectile motion of a ball fired from a gun
with particular specifications. The data were gathered by keeping the gun on a stationary
platform and firing at a stationary target. The gun can swivel anywhere from 0 to 90
degrees about the Z axis; the angle that is formed is the elevation angle theta. The gun
can also swivel 360 degrees about the origin of the X-Y plane; the angle formed between
the X axis and the direction the gun is pointing is called the azimuthal angle phi. Wind is
also taken into consideration in the X-Y plane, giving the wind its own azimuthal angle
alpha. Finally, the wind, if it is blowing, imparts some acceleration to the projectile,
which we refer to as "a". Hence, each instance of the training dataset consists of five
input values:
• The initial velocity of the projectile (Vo)
• The elevation angle of the gun (Θ)
• The azimuthal angle of the gun (Φ)
• The horizontal acceleration of the projectile due to wind (a)
• The azimuthal angle of the wind acceleration vector (α)
Each instance of the training dataset also includes two output values:
• The target co-ordinates (Xt, Yt, 0)
• The projectile impact co-ordinates (Xi, Yi, 0)
Table 5: Example training datasets for 2-D projectile data motion

                               Inputs                              Outputs
Instance      Vo          Θ         Φ      a       α        Target      Impact
    1      -185.680     62.3995   33.0    7.0    3.203     57.8742    136.039
    2        46.4323   -21.428    40.0    0.0    0.8099     9.1267    335.227
    3      -197.269     80.5293   49.0    0.0    3.791     30.211     157.793
    4       -45.512    -66.324    31.0    0.0    1.0952    62.4432    235.541
    5       130.623   -164.342    49.0    3.0    0.0       68.9361    270.0
    6        61.9628    42.6237   31.0    2.0    0.6027    22.502      34.523
    7        32.177     -3.891    35.0    0.0    5.7311     7.5142    353.103
Table 6: Example testing datasets for 2-D projectile data motion

                               Inputs                              Outputs
Instance      Vo          Θ         Φ      a       α        Target      Impact
    1      -136.583     10.491    39.0    7.0    1.928     26.795     194.710
    2        34.339     210.004   49.0    1.0    4.509     34.542      80.5470
    3        82.987    -124.332   32.0    9.0    4.930     33.781     316.559
    4       -48.627     40.255    35.0    9.0    0.5604    37.4732    182.333
    5       153.010     174.322   46.0    4.0    1.3616    73.017       7.921
    6      -165.138     119.676   37.0    4.0    2.754     53.303     136.59
Backpropagation uses the following parameter values:
Learning rate = 0.3
Criteria = 1
4.1.3. Wine recognition data
The wine recognition datasets are the result of a chemical analysis of wines grown
in the same region in Italy but derived from three different cultivars. The chemical
analysis determined the quantities of 13 constituents found in each of the three types of
wines. The output is classified into three categories according to the class each wine
belongs to. Each instance of the training dataset consists of thirteen input values:
• Alcohol
• Malic acid
• Ash
• Alkalinity of ash
• Magnesium
• Total phenols
• Flavonoids
• Non-flavonoid phenols
• Proanthocyanins
• Color intensity
• Hue
• OD280/OD315 of diluted wines
• Proline
Each instance of the training dataset consists of one output pattern:
• Class distribution (1, 2, or 3)
Table 7: Example training datasets for wine recognition

N                                    Inputs                                           O
1   14.2  1.71  2.43  15.6  127  2.8  3.06  .28  2.2   5.6  1.04  3.9   1065    1
2   13.2  1.78  2.14  11.2  100  2.6  2.76  .26  1.2   4.3  1.05  3.4   1050    1
3   12.3   .94  1.36  10.6   88  1.9   .57  .28   .42  1.9  1.05  1.8    520    2
4   12.3  1.1   2.28  16    101  2.0  1.09  .63   .41  3.2  1.25  1.6    680    2
5   12.8  1.35  2.32  18    122  1.5  1.25  .21   .94  4.1   .76  1.2    630    3
6   12.8  2.31  2.4   24     98  1.1  1.09  .27   .83  5.7   .66  1.3    560    3
Table 8: Example testing datasets for wine recognition

N                                    Inputs                                           O
1   14    1.95  2.5   16.8  113  3.8  3.49  .24  2.1   7.8   .86  3.45  1480    1
2   12    1.81  2.2   18.8   86  2.2  2.53  .26  1.7   3.9  1.1   3.14   714    2
3   13    3.9   2.3   21.5  113  1.4  1.39  .34  1.1   9.4   .57  1.33   550    3
Backpropagation uses the following parameter values:
Learning rate = 0.3
Criteria = 0.1
4.2. Measuring performance
The main objective of this study is to improve the scaling process of the
backpropagation algorithm so that it can perform error correction as quickly as possible
and reach the desired output in the fewest iterations. To compare the linear scaling
process against the non-linear scaling process, we have to study the main factors of the
algorithm, such as the speed of learning and how well the designed neural network
generalizes.
We set the learning rate and other parameters before applying the scaling, keep
these values fixed, and then observe the error calculations produced by the linear scaling
process and by the non-linear scaling process for each particular dataset.
Another factor that needs to be observed is the criteria value for each dataset
being tested. Setting the criteria value too low might cause the neural network
calculation to run indefinitely or for far too many iterations, and the comparison between
linear and non-linear scaling would be inconclusive. On the other hand, too large a
criteria value would allow training to conclude before learning is complete. Therefore,
we need to find the best criteria value for a particular dataset by trial and error. For
example, in the projectile datasets the training pairs are dispersed across a large range
(one value is -150 and another is 220), so if the criteria is set too low the neural network
has to undergo too many iterations. However, for datasets that do not have a large
distribution in their input values, we can apply a criteria value as low as 0.001. Changing
the criteria value for each test and each dataset makes a significant difference in the
learning process.
Another important observation for the comparison is to study the difference
between the calculated outputs and the corresponding desired outputs by eye. After the
specified number of iterations the backpropagation algorithm terminates, even if the
results did not meet the specified criteria. For such results, the small differences in error
in the generated output sets must be examined.
After the learning process finishes calculating the weights, they are applied to the
test datasets. During this test, we need to observe how well the network learned for both
scaling methods. During this test execution, the algorithm does not undergo any iteration
for weight manipulation; it simply uses the calculated weights to determine the output.
Chapter 5
RESULTS
After the parameters required by backpropagation algorithm for non-linear scaling
have been set properly, the datasets are applied to the designed artificial neural network.
Both the datasets, i.e. the training datasets and the testing datasets are given as input to
the algorithm. Complete outputs for training runs are given in Appendix C.
5.1. Result for minimum weight problem datasets
After executing the backpropagation method with linear scaling, the network did
not converge within the declared iteration limit of 900000 iterations. On the other hand,
with non-linear scaling the network converged before the defined number of iterations.
As discussed earlier, the number of iterations depends on the criteria value; both
executions used the same criteria value. With non-linear scaling the network learned the
dataset within 359299 iterations, whereas with the linear scaling method the network did
not converge. Furthermore, this methodology was tested using the test datasets.
Observing the outputs for the test datasets, the network applied the weights that were
adjusted during the learning phase. The percentage of the testing cases that meet the
criteria for non-linear scaling is 100, i.e. all the outputs generated by the neural network
were within the desired criteria.
Looking at the results of backpropagation for minimum weight problem dataset,
we can conclude that, for the non-linear scaling, the network learned faster than when the
datasets were scaled linearly.
Table 9: Result for minimum weight problem

Operation                                       Linear Scaling              Non-Linear Scaling
Total number of iterations                      900000                      900000
Converged                                       Network did not converge    Network converged in 359299 iterations
Percentage Generalization for training cases    80%                         100%
Percentage Generalization for test cases        83.33%                      100%
5.2. Result for two-dimensional projectile motion dataset
For this dataset, in both cases, i.e. linear scaling and non-linear scaling, the
network does not converge within 90000000 iterations. Therefore, it is hard to tell which
one did better. However, if we observe the outputs generated by the non-linear scaling
method, there is a small improvement in the difference between the desired output and
the calculated output. For example, from the output (refer to Appendix C for the results),
the desired result is supposed to be 59.3108; the linear scaling run produces 64.2965,
whereas the non-linear scaling run produces 62.1667, so 62.1667 is closer to the desired
value of 59.3108.
Table 10: Result for 2-D projectile motion problem

Operation                                       Linear Scaling              Non-Linear Scaling
Total number of iterations                      90000000                    90000000
Converged                                       Network did not converge    Network did not converge
Percentage Generalization for training cases    4.84%                       12.168%
Percentage Generalization for test cases        22.22%                      15.556%
5.3. Result for wine recognition data
We apply the wine recognition problem to the backpropagation algorithm under
both the linear and non-linear scaling methodologies. This particular problem helps show
that non-linear scaling can outperform the linear scaling method. For the linear scaling
method, the percentage of training cases that meet the criteria is 99.4382, whereas for the
non-linear scaling method it is 100.00 percent. The number of iterations for the linear
scaling methodology is less than for the non-linear scaling. This implies that the linear
scaling methodology learned faster than the non-linear one, but did not learn as
accurately. Therefore, the output generated with the non-linear method was more
accurate than the output generated with the linear method.
Table 11: Result for wine recognition data

Operation                                       Linear Scaling                           Non-Linear Scaling
Total number of iterations                      9000000                                  9000000
Converged                                       Network converged in 275049 iterations   Network converged in 306377 iterations
Percentage Generalization for training cases    99.4382%                                 100%
Percentage Generalization for test cases        100%                                     100%
Chapter 6
CONCLUSION
Artificial neural networks learn by example, and with good training datasets they
complete the learning process faster. The linear scaling process currently used in
backpropagation performs poorly when the training datasets contain large differences
and clustered values. In real-world problems, the data that the neural network uses are
often scattered across a large range. If such distributed data are scaled linearly, the
learning effort of the neural network may increase.
A new method of non-linear scaling, called median scaling, tries to lessen the
clustering issues in the datasets. The algorithm takes all the input and output datasets at
once and determines the number of inputs and the number of training cases. Then, for
each column in the datasets, the algorithm finds three values: the minimum, the
maximum and the median, and scales the data accordingly.
After performing the experiments with non-linear (median-type) scaling on
various datasets, we can say that non-linear scaling improves the error calculation to
some extent. Therefore, non-linear scaling is one way to decrease the training effort.
Linear scaling can still work well for data whose range is not widely dispersed but rather
tightly clustered.
Chapter 7
FUTURE WORK
There is considerable scope for improving the backpropagation algorithm in terms
of scaling the datasets. For instance, this experiment applies non-linear scaling to the
output datasets; we could further apply non-linear scaling to the input datasets as well
and study the resulting error calculations.
Another approach to non-linear scaling would be to perform clustering analysis on
the training datasets. Each populated dataset can be clustered and then classified into
groups. We could then split the data at multiple points and apply scaling to each segment.
So, instead of finding the median and then scaling the datasets about it, we would find
multiple split points according to the clustering analysis and scale each segment
separately.
APPENDIX A
Source Code
/**************************************************
 Neural Network with Backpropagation using Non-Linear Scaling methodology
 --------------------------------------------------
 Modified by Alok Nakate
 California State University Sacramento
 Date: October 2010
 Change: Non-Linear Scaling
 --------------------------------------------------
 Adapted from D. Whitley, Colorado State University
 Modifications by S. Gordon
 --------------------------------------------------
 Version 3.0 - October 2009 - includes momentum
 --------------------------------------------------
 compile with g++ nn.c
 **************************************************/
#include <iostream>
#include <fstream>
#include <cmath>
using namespace std;
#define NumOfCols    3       /* number of layers +1 i.e, include input layer  */
#define NumOfRows    14      /* max number of rows net +1, last is bias node  */
#define NumINs       13      /* number of inputs, not including bias node     */
#define NumOUTs      1       /* number of outputs, not including bias node    */
#define LearningRate 0.3     /* most books suggest 0.3                        */
#define Criteria     0.01    /* all outputs must be within this to terminate  */
#define TestCriteria 0.02    /* all outputs must be within this to terminate  */
#define MaxIterate   1000000 /* maximum number of iterations                  */
#define ReportIntv   1001    /* print report every time this many cases done  */
#define Momentum     0.8     /* momentum constant                             */
#define TrainCases   178     /* number of training cases                      */
#define TestCases    15      /* number of test cases                          */
// network topology by column ------------------------------------
#define NumNodes1 14         /* col 1 - must equal NumINs+1       */
#define NumNodes2 14         /* col 2 - hidden layer 1, etc.      */
#define NumNodes3 1          /* output layer must equal NumOUTs   */
#define NumNodes4 0          /*                                   */
#define NumNodes5 0          /* note: layers include bias node    */
#define NumNodes6 0

#define TrainFile "winetrain.dat" /* file containing training data */
#define TestFile  "winetest.dat"  /* file containing testing data  */
int NumRowsPer[NumOfRows];   /* number of rows used in each column incl. bias */
                             /* note - bias is not included on output layer   */
                             /* note - leftmost value must equal NumINs+1     */
                             /* note - rightmost value must equal NumOUTs     */
double TrainArray[TrainCases][NumINs + NumOUTs];

// an array for finding the median for a particular column
// (from the given inputs and outputs)
double MedianArray[TrainCases];

double TestArray[TestCases][NumINs + NumOUTs];
int CritrIt = 2 * TrainCases;

ifstream train_stream;                  /* source of training data */
ifstream test_stream;                   /* source of test data     */
ofstream result_stream("result.txt");

void CalculateInputsAndOutputs ();
void TestInputsAndOutputs();
void TestForward();
double ScaleOutput(double X, int which);
// Alok
double ScaleOutputGeneralise(double X, int which);
double ScaleDown(double X, int which);
// Alok
double ScaleDownGeneralise(double X, int which);
// Alok
double ScaleCriteria(double ActualOutput, int which);
double ScaleTestCriteria(double ActualOutput, int which);
void GenReport(int Iteration);
void TrainForward();
void FinReport(int Iteration);
void DumpWeights();
// Alok
double * sort(double arr[], int numR);
void quicksort(int arr[], int low, int high);
double FindMedian(double arr[], int numR);
double getMedian(double arr[], int numR);
//
struct CellRecord
{
double Output;
double Error;
double Weights[NumOfRows];
double PrevDelta[NumOfRows];
};
struct CellRecord CellArray[NumOfRows][NumOfCols];
double Inputs[NumINs];
double DesiredOutputs[NumOUTs];
double extrema[NumINs+NumOUTs][3]; // [0] is low, [1] is hi, [2] is median
long Iteration;
/************************************************************
Get data from Training and Testing Files, put into arrays
The scaling process also occurs in this step.
*************************************************************/
void GetData()
{
for (int i=0; i < (NumINs+NumOUTs); i++)
{ extrema[i][0]=99999.0; extrema[i][1]=-99999.0; }
// read in training data
train_stream.open(TrainFile);
for (int i=0; i < TrainCases; i++)
{ for (int j=0; j < (NumINs+NumOUTs); j++)
{ train_stream >> TrainArray[i][j];
if (TrainArray[i][j] < extrema[j][0]) extrema[j][0] = TrainArray[i][j];
if (TrainArray[i][j] > extrema[j][1]) extrema[j][1] = TrainArray[i][j];
}}
train_stream.close();
// read in test data
test_stream.open(TestFile);
for (int i=0; i < TestCases; i++)
{ for (int j=0; j < (NumINs+NumOUTs); j++)
{ test_stream >> TestArray[i][j];
if (TestArray[i][j] < extrema[j][0]) extrema[j][0] = TestArray[i][j];
if (TestArray[i][j] > extrema[j][1]) extrema[j][1] = TestArray[i][j];
}}
// guard against both extrema being equal
for (int i=0; i < (NumINs+NumOUTs); i++)
if (extrema[i][0] == extrema[i][1]) extrema[i][1]=extrema[i][0]+1;
test_stream.close();
// scale training and test data to range 0..1
/**********************************************************************
Apply Scaling to Training cases
**********************************************************************/
for (int i=0; i < TrainCases; i++)
{
for (int j=0; j < NumINs; j++)
TrainArray[i][j] = ScaleDown(TrainArray[i][j],j);
}
// Alok
// Find the Median of a particular column
// Currently it finds the medians of all the output columns.
for (int k=NumINs; k < NumINs+NumOUTs; k++)
{
for (int i=0; i < TrainCases; i++)
{
MedianArray[i] = TrainArray[i][k];
}
extrema [k][2] = FindMedian(MedianArray, TrainCases);
}
// Apply the NON-LINEAR Scaling using the median
for (int i=0; i < TrainCases; i++)
{
for (int k=NumINs; k < NumINs+NumOUTs; k++)
TrainArray[i][k] = ScaleDownGeneralise(TrainArray[i][k],k);
}
/**********************************************************************
Apply Scaling to Test cases
**********************************************************************/
for (int i=0; i < TestCases; i++)
{
for (int j=0; j < NumINs; j++)
TestArray[i][j] = ScaleDown(TestArray[i][j],j);
for (int k=NumINs; k < NumINs+NumOUTs; k++)
TestArray[i][k] = ScaleDownGeneralise(TestArray[i][k],k);
}
}
double FindMedian(double arr[], int numR)
{
double valMedian;
arr = sort (arr, numR);
valMedian = getMedian(arr, numR);
return valMedian;
}
/***************************************************************
Function to find the median of a particular sorted column
***************************************************************/
double getMedian(double arr[], int numR)
{
int middle = numR/2;
double average;
if (numR%2==0)
average = (arr[middle-1]+arr[middle])/2;
else
average = (arr[middle]);
return average;
}
/***************************************************************
Function to sort a particular column
***************************************************************/
double * sort(double arr[], int numR)
{
double temp;
for (int i = (TrainCases - 1); i >= 0; i--)
{
for (int j = 1; j <= i; j++)
{
if (MedianArray[j-1] > MedianArray[j])
{
temp = MedianArray[j-1];
MedianArray[j-1] = MedianArray[j];
MedianArray[j] = temp;
}
}
}
return arr;
}
/**************************************************************
Assign the next training pair
***************************************************************/
void CalculateInputsAndOutputs()
{
static int S=0;
for (int i=0; i < NumINs; i++) Inputs[i]=TrainArray[S][i];
for (int i=0; i < NumOUTs; i++) DesiredOutputs[i]=TrainArray[S][i+NumINs];
S++;
if (S==TrainCases) S=0;
}
/**************************************************************
Assign the next testing pair
***************************************************************/
void TestInputsAndOutputs()
{
static int S=0;
for (int i=0; i < NumINs; i++) Inputs[i]=TestArray[S][i];
for (int i=0; i < NumOUTs; i++) DesiredOutputs[i]=TestArray[S][i+NumINs];
S++;
if (S==TestCases) S=0;
}
/************************* MAIN *************************************/
int main()
{
int I, J, K, existsError, ConvergedIterations=0;
long seedval;
double Sum, newDelta, scaledCriteria;
Iteration=0;
NumRowsPer[0] = NumNodes1; NumRowsPer[3] = NumNodes4;
NumRowsPer[1] = NumNodes2; NumRowsPer[4] = NumNodes5;
NumRowsPer[2] = NumNodes3; NumRowsPer[5] = NumNodes6;
/* initialize the weights to small random values. */
/* initialize previous changes to 0 (momentum). */
seedval = 555;
srand(seedval);
for (I=1; I < NumOfCols; I++)
for (J=0; J < NumRowsPer[I]; J++)
for (K=0; K < NumRowsPer[I-1]; K++)
{ CellArray[J][I].Weights[K] = 2.0 * ((double)((int)rand() % 100000 / 100000.0)) - 1.0;
CellArray[J][I].PrevDelta[K] = 0;
}
GetData(); // read training and test data into arrays
cout << endl << "Iteration      Inputs      ";
result_stream << "Iteration      Inputs      ";
cout << "Desired Outputs      Actual Outputs" << endl;
result_stream << "Desired Outputs      Actual Outputs" << endl;
// -------------------------------
// beginning of main training loop
do
{ /* retrieve a training pair */
CalculateInputsAndOutputs();
for (J=0; J < NumRowsPer[0]-1; J++) CellArray[J][0].Output = Inputs[J];
/* set up bias nodes */
for (I=0; I < NumOfCols-1; I++)
{ CellArray[NumRowsPer[I]-1][I].Output = 1.0;
CellArray[NumRowsPer[I]-1][I].Error = 0.0;
}
/**************************
* FORWARD PASS
*
**************************/
/* hidden layers */
for (I=1; I < NumOfCols-1; I++)
for (J=0; J < NumRowsPer[I]-1; J++)
{ Sum = 0.0;
for (K=0; K < NumRowsPer[I-1]; K++)
Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][I].Error = 0.0;
}
/* output layer */
for (J=0; J < NumOUTs; J++)
{ Sum = 0.0;
for (K=0; K < NumRowsPer[NumOfCols-2]; K++)
Sum += CellArray[J][NumOfCols-1].Weights[K]
* CellArray[K][NumOfCols-2].Output;
CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][NumOfCols-1].Error = 0.0;
}
/**************************
* BACKWARD PASS
*
**************************/
/* calculate error at each output node */
for (J=0; J < NumOUTs; J++)
CellArray[J][NumOfCols-1].Error =
DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;
/* check to see how many consecutive oks seen so far */
existsError = 0;
for (J=0; J < NumOUTs; J++)
{
// Alok
// Apply non linear scaling to the criteria too as we applied it to
// the outputs initially
scaledCriteria = ScaleCriteria(CellArray[J][NumOfCols-1].Output, NumINs+J);
if (fabs(CellArray[J][NumOfCols-1].Error) > scaledCriteria)
{
existsError = 1;
}
}
if (existsError == 0) ConvergedIterations++;
else ConvergedIterations = 0;
/* apply derivative of squashing function to output errors */
for (J=0; J < NumOUTs; J++)
CellArray[J][NumOfCols-1].Error =
CellArray[J][NumOfCols-1].Error
* CellArray[J][NumOfCols-1].Output
* (1.0 - CellArray[J][NumOfCols-1].Output);
/* backpropagate error */
/* output layer */
for (J=0; J < NumRowsPer[NumOfCols-2]; J++)
for (K=0; K < NumRowsPer[NumOfCols-1]; K++)
CellArray[J][NumOfCols-2].Error = CellArray[J][NumOfCols-2].Error
+ CellArray[K][NumOfCols-1].Weights[J]
* CellArray[K][NumOfCols-1].Error
* (CellArray[J][NumOfCols-2].Output)
* (1.0-CellArray[J][NumOfCols-2].Output);
/* hidden layers */
for (I=NumOfCols-3; I>=0; I--)
for (J=0; J < NumRowsPer[I]; J++)
for (K=0; K < NumRowsPer[I+1]-1; K++)
CellArray[J][I].Error =
CellArray[J][I].Error
+ CellArray[K][I+1].Weights[J] * CellArray[K][I+1].Error
* (CellArray[J][I].Output) * (1.0-CellArray[J][I].Output);
/* adjust weights */
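/* each change combines a momentum fraction of the previous change with
the learning-rate-scaled error signal times the sending node's output */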
for (I=1; I < NumOfCols; I++)
for (J=0; J < NumRowsPer[I]; J++)
for (K=0; K < NumRowsPer[I-1]; K++)
{ newDelta = (Momentum * CellArray[J][I].PrevDelta[K])
+ (LearningRate * CellArray[K][I-1].Output * CellArray[J][I].Error);
CellArray[J][I].Weights[K] = CellArray[J][I].Weights[K] + newDelta;
CellArray[J][I].PrevDelta[K] = newDelta;
}
GenReport(Iteration);
Iteration++;
} while (!((ConvergedIterations >= CritrIt) || (Iteration >= MaxIterate)));
// end of main training loop
// -------------------------------
FinReport(ConvergedIterations);
TrainForward();
TestForward();
}
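/*******************************************
Scale the convergence criterion
Converts the Criteria tolerance with the same piecewise,
median-based factors that ScaleDownGeneralise applies to the
outputs, so the error comparison is made in scaled space
*******************************************/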
double ScaleCriteria(double ActualOutput, int which)
{
double range, allPos;
if (ActualOutput < 0.5)
{
range = (extrema[which][2]-extrema[which][0]);
allPos = ((.9*(Criteria/range)))/2;
}
else
if (ActualOutput >= 0.5)
{
range = (extrema[which][1]-extrema[which][2]);
allPos = (((.9*(Criteria/range)))/2)+0.5;
}
return (allPos);
}
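/*******************************************
Same as ScaleCriteria, but uses the TestCriteria
tolerance for the test cases
*******************************************/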
double ScaleTestCriteria(double ActualOutput, int which)
{
double range, allPos;
if (ActualOutput < 0.5)
{
range = (extrema[which][2]-extrema[which][0]);
allPos = ((.9*(TestCriteria/range)))/2;
}
else
if (ActualOutput >= 0.5)
{
range = (extrema[which][1]-extrema[which][2]);
allPos = (((.9*(TestCriteria/range)))/2)+0.5;
}
return (allPos);
}
/*******************************************
Linearly scale a value to the range .05 - .95
*******************************************/
double ScaleDown(double X, int which)
{
double allPos;
allPos = .9*(X-extrema[which][0])/(extrema[which][1]-extrema[which][0])+.05;
return (allPos);
}
/************************************************
Non-linearly scale an output value to 0..1,
using the column median as the split point
*************************************************/
double ScaleDownGeneralise(double X, int which)
{
double range, allPos;
if (X < extrema[which][2])
{
range = (extrema[which][2]-extrema[which][0]);
allPos = ((.9*((X-extrema[which][0])/range))+.05)/2;
}
else
if (X >= extrema[which][2])
{
range = (extrema[which][1]-extrema[which][2]);
allPos = (((.9*((X-extrema[which][2])/range))+.05)/2)+0.5;
}
return (allPos);
}
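// Illustrative example (hypothetical column with min 0, median 2, max 10):
// values below the median such as 0 and 1 map to about .025 and .25, while
// the median itself and values above it such as 2, 6 and 10 map to about
// .525, .75 and .975, so each half of the data receives half of the 0..1 range.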
/*******************************************
Linearly scale a value back to its original
range (inverse of ScaleDown)
*******************************************/
double ScaleOutput(double X, int which)
{
double range = extrema[which][1] - extrema[which][0];
double scaleUp = ((X-.05)/.9) * range;
return (extrema[which][0] + scaleUp);
}
/*************************************************
Non-linearly scale an output back to its original
range (inverse of ScaleDownGeneralise)
**************************************************/
double ScaleOutputGeneralise(double X, int which)
{
double range, scaleUp, res;
if (X < 0.5)
{
range = extrema[which][2] - extrema[which][0];
scaleUp = ((((X*2)-.05)/.9) * range);
res = (extrema[which][0] + scaleUp);
//(((X*2) - 0.05)/0.9) * range
}
if (X >= 0.5)
{
range = extrema[which][1] - extrema[which][2];
scaleUp = (((((X-0.5)*2)-.05)/.9) * range);
res = (extrema[which][2] + scaleUp);
}
return (res);
}
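// This is the inverse of ScaleDownGeneralise: with the same hypothetical
// extrema (min 0, median 2, max 10), a network output of .25 maps back to 1
// and .75 maps back to 6.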
/*******************************************
Run Test Data forward pass only
*******************************************/
void TestForward()
{
int GoodCount=0;
double Sum, TotalError=0, scaledTestCriteria;
cout << "Running Test Cases" << endl;
result_stream << "Running Test Cases" << endl;
for (int H=0; H < TestCases; H++)
{ TestInputsAndOutputs();
for (int J=0; J < NumRowsPer[0]-1; J++) CellArray[J][0].Output = Inputs[J];
/* hidden layers */
for (int I=1; I < NumOfCols-1; I++)
for (int J=0; J < NumRowsPer[I]-1; J++)
{ Sum = 0.0;
for (int K=0; K < NumRowsPer[I-1]; K++)
Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][I].Error = 0.0;
}
/* output layer */
for (int J=0; J < NumOUTs; J++)
{ Sum = 0.0;
for (int K=0; K < NumRowsPer[NumOfCols-2]; K++)
Sum += CellArray[J][NumOfCols-1].Weights[K]
* CellArray[K][NumOfCols-2].Output;
CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][NumOfCols-1].Error =
DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;
scaledTestCriteria = ScaleTestCriteria(CellArray[J][NumOfCols-1].Output, NumINs+J);
if (fabs(CellArray[J][NumOfCols-1].Error) <= scaledTestCriteria )
GoodCount++;
TotalError += CellArray[J][NumOfCols-1].Error *
CellArray[J][NumOfCols-1].Error;
}
GenReport(-1);
}
cout << endl;
result_stream << endl;
cout << "Sum Squared Error for Testing cases = " << TotalError << endl;
result_stream << "Sum Squared Error for Testing cases = " << TotalError << endl;
cout << "% of Testing Cases that meet criteria = " << ((double)GoodCount/(double)TestCases)*100;
result_stream << "% of Testing Cases that meet criteria = " << ((double)GoodCount/(double)TestCases)*100;
cout << endl;
result_stream << endl;
cout << endl;
result_stream << endl;
}
/*****************************************************
Run Training Data forward pass only, after training
******************************************************/
void TrainForward()
{
int GoodCount=0;
double Sum, TotalError=0, scaledCriteria;
cout << endl << "Confirm Training Cases" << endl;
result_stream << endl << "Confirm Training Cases" << endl;
for (int H=0; H < TrainCases; H++)
{ CalculateInputsAndOutputs ();
for (int J=0; J < NumRowsPer[0]-1; J++) CellArray[J][0].Output = Inputs[J];
/* hidden layers */
for (int I=1; I < NumOfCols-1; I++)
for (int J=0; J < NumRowsPer[I]-1; J++)
{ Sum = 0.0;
for (int K=0; K < NumRowsPer[I-1]; K++)
Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][I].Error = 0.0;
}
/* output layer */
for (int J=0; J < NumOUTs; J++)
{ Sum = 0.0;
for (int K=0; K < NumRowsPer[NumOfCols-2]; K++)
Sum += CellArray[J][NumOfCols-1].Weights[K]
* CellArray[K][NumOfCols-2].Output;
CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
CellArray[J][NumOfCols-1].Error =
DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;
scaledCriteria = ScaleCriteria(CellArray[J][NumOfCols-1].Output, NumINs+J);
if (fabs(CellArray[J][NumOfCols-1].Error) <= scaledCriteria)
GoodCount++;
TotalError += CellArray[J][NumOfCols-1].Error *
CellArray[J][NumOfCols-1].Error;
}
GenReport(-1);
}
cout << endl;
result_stream << endl;
cout << "Sum Squared Error for Training cases = " << TotalError << endl;
result_stream << "Sum Squared Error for Training cases = " << TotalError << endl;
cout << "% of Training Cases that meet criteria = " <<
((double)GoodCount/(double)TrainCases)*100 << endl;
result_stream << "% of Training Cases that meet criteria = " <<
((double)GoodCount/(double)TrainCases)*100 << endl;
cout << endl;
result_stream << endl;
}
/*******************************************
Final Report
*******************************************/
void FinReport(int CIterations)
{
cout.setf(ios::fixed); cout.setf(ios::showpoint); cout.precision(4);
result_stream.setf(ios::fixed); result_stream.setf(ios::showpoint); result_stream.precision(4);
if (CIterations<CritrIt)
{
cout << "Network did not converge" << endl;
result_stream << "Network did not converge" << endl;
}
else
{
cout << "Converged to within criteria" << endl;
result_stream << "Converged to within criteria" << endl;
}
cout << "Total number of iterations = " << Iteration << endl;
result_stream << "Total number of iterations = " << Iteration << endl;
}
/*******************************************
Generation Report
pass in a -1 if running test cases
*******************************************/
void GenReport(int Iteration)
{
int J;
cout.setf(ios::fixed); cout.setf(ios::showpoint); cout.precision(4);
result_stream.setf(ios::fixed); result_stream.setf(ios::showpoint); result_stream.precision(4);
if (Iteration == -1)
{ for (J=0; J < NumRowsPer[0]-1; J++)
{
cout << " " << ScaleOutput(Inputs[J],J);
result_stream << " " << ScaleOutput(Inputs[J],J);
}
cout << " ";
result_stream << " ";
for (J=0; J < NumOUTs; J++)
{
cout << " " << ScaleOutputGeneralise(DesiredOutputs[J],NumINs+J);
result_stream << " " << ScaleOutputGeneralise(DesiredOutputs[J],NumINs+J);
}
cout << " ";
result_stream << " ";
for (J=0; J < NumOUTs; J++)
{
cout << " " << ScaleOutputGeneralise(CellArray[J][NumOfCols-1].Output,NumINs+J);
result_stream << " " << ScaleOutputGeneralise(CellArray[J][NumOfCols-1].Output,NumINs+J);
}
cout << endl;
result_stream << endl;
}
else if ((Iteration % ReportIntv) == 0)
{ cout << " " << Iteration << " ";
result_stream << " " << Iteration << " ";
for (J=0; J < NumRowsPer[0]-1; J++)
{
cout << " " << ScaleOutput(Inputs[J],J);
result_stream << " " << ScaleOutput(Inputs[J],J);
}
cout << " ";
result_stream << " ";
for (J=0; J < NumOUTs; J++)
{
cout << " " << ScaleOutputGeneralise(DesiredOutputs[J],NumINs+J);
result_stream << " " << ScaleOutputGeneralise(DesiredOutputs[J],NumINs+J);
}
cout << " ";
result_stream << " ";
for (J=0; J < NumOUTs; J++)
{
cout << " " << ScaleOutputGeneralise(CellArray[J][NumOfCols-1].Output,NumINs+J);
result_stream << " " << ScaleOutputGeneralise(CellArray[J][NumOfCols-1].Output,NumINs+J);
}
cout << endl;
result_stream << endl;
}
}
APPENDIX B
Datasets
The three datasets used with the backpropagation algorithm:
1. Minimum weight steel beam problem dataset
0.40 .40 .190 .190
.6313
0.20 .20 .120 .240
.3630
0.35 .35 .035 .035
.1000
0.15 .15 .045 .120
.1400
0.15 .15 .035 .095
.1000
0.06 .06 .018 .123
.0490
0.12 .06 .018 .067
.1000
0.10 .10 .007 .028
.0233
0.06 .06 .004 .024
.0110
0.20 .20 .050 .100
.1590
2. 2D projectile motion dataset
-185.68085 62.39953 0.0 33.0 7.0 3.2038
57.87422 136.03995
46.43233 -21.4281 0.0 40.0 0.0 0.80996 9.12676 335.2271
-197.26916 80.52932 0.0 49.0 0.0 3.7913
30.21106 157.79371
-45.51251 -66.32454 0.0 31.0 0.0 1.09526
62.44321 235.54173
130.62391 -164.34226 0.0 49.0 3.0 0.0 68.93617 270.0
61.96282 42.62375 0.0 31.0 2.0 0.6027 22.50215 34.52309
-42.51268 236.75163 0.0 46.0 5.0 1.97327
82.40679 41.62494
-8.97536 7.36584 0.0 33.0 1.0 3.96902 2.99796 140.31947
-29.53386 -73.83029 0.0 32.0 4.0 0.0
46.10695 225.0
171.80714 -108.08147 0.0 43.0 8.0 0.0 27.05438 315.0
10.25567 115.54738 0.0 46.0 7.0 3.34476
5.9313 -21.85311 0.0 40.0 0.0 0.5036
18.15832 71.96181
86.01389 285.1852
307.65628 170.32188 0.0 43.0 9.0 0.68538
44.1592 19.79431
-79.70994 0.37885 0.0 31.0 0.0 5.64623 27.18855 179.72769
0.0 108.02264 0.0 39.0 0.0 0.0 67.94637 90.0
-3.06196 11.22578 0.0 32.0 9.0 3.95672 3.29125 102.67458
-48.77648 101.84351 0.0 39.0 8.0 0.99979
54.85066 196.16885
267.25881 182.24948 0.0 40.0 8.0 0.79311
70.8592 7.23498
109.22732 -94.95332 0.0 43.0 3.0 1.81711
39.85647 310.59396
-66.24069 74.12901 0.0 37.0 8.0 1.09514
50.63463 200.09005
-291.67982 -76.96141 0.0 40.0 7.0 3.60032
53.89839 183.51809
-3.03388 131.93313 0.0 31.0 9.0 1.46092
76.80569 240.05522
32.17785 -3.89194 0.0 35.0 0.0 5.73113 7.51423 353.10353
53.00022 72.62676 0.0 49.0 9.0 5.33339
11.70271 64.28479
85.76912 101.11365 0.0 33.0 5.0 1.29635
33.46894 41.62803
-17.92859 338.2669 0.0 40.0 8.0 1.70127
69.49202 83.30323
-13.07071 255.31984 0.0 46.0 9.0 1.75899
27.03595 89.25971
192.01133 -162.59587 0.0 49.0 9.0 4.53268
46.49823 16.71117
-35.43567 31.33487 0.0 43.0 5.0 4.98891
56.16365 114.25625
-52.46014 66.44012 0.0 33.0 0.0 6.06239
24.81207 128.2941
89.23896 -78.41001 0.0 49.0 4.0 0.0
13.45551 315.0
-34.93675 501.70877 0.0 49.0 9.0 1.36357
68.64398 133.90507
-20.93785 -7.27552 0.0 35.0 9.0 1.21631
5.42342 203.02062
-39.84531 -15.16282 0.0 37.0 1.0 4.19117
8.77469 200.2623
-38.27994 37.26014 0.0 40.0 2.0 0.65763
9.60368 137.73263
-96.35594 -40.79194 0.0 31.0 4.0 2.34045
47.31952 227.32617
10.36563 6.58753 0.0 31.0 6.0 0.44178 61.9614 204.23973
-50.26037 -4.93124 0.0 31.0 9.0 5.36504
31.81111 156.64365
-56.69363 0.0 0.0 32.0 6.0 0.0 34.24794 180.0
-86.42147 100.59401 0.0 40.0 4.0 0.65914
28.73856 143.5829
80.85112 42.29208 0.0 31.0 8.0 1.58313
49.83195 328.02689
196.48956 0.0 0.0 43.0 3.0 0.0 30.8483 0.0
229.20318 -15.612 0.0 49.0 5.0 5.27109 63.24488 51.15651
110.58687 16.18088 0.0 37.0 1.0 2.66207
56.59575 3.13154
17.6044 -95.25832 0.0 31.0 3.0 0.05457 40.73669 265.3123
39.30388 -37.403 0.0 36.0 1.0 5.58348 83.9627 313.05209
175.17317 21.09589 0.0 43.0 1.0 5.10087
57.41801 15.72236
-167.18443 -4.74984 0.0 46.0 6.0 3.23038
78.13924 11.75959
-66.90301 20.68802 0.0 32.0 8.0 1.34695
21.84847 181.86529
-43.18237 -32.46084 0.0 36.0 0.0 3.19018
12.05541 216.93274
3. Wine recognition dataset
14.23 1.71 2.43 15.6 127 2.8 3.06 .28 2.29 5.64 1.04 3.92 1065 1
13.2 1.78 2.14 11.2 100 2.65 2.76 .26 1.28 4.38 1.05 3.4 1050 1
13.16 2.36 2.67 18.6 101 2.8 3.24 .3 2.81 5.68 1.03 3.17 1185 1
14.37 1.95 2.5 16.8 113 3.85 3.49 .24 2.18 7.8 .86 3.45 1480 1
13.24 2.59 2.87 21 118 2.8 2.69 .39 1.82 4.32 1.04 2.93 735 1
14.2 1.76 2.45 15.2 112 3.27 3.39 .34 1.97 6.75 1.05 2.85 1450 1
14.39 1.87 2.45 14.6 96 2.5 2.52 .3 1.98 5.25 1.02 3.58 1290 1
14.06 2.15 2.61 17.6 121 2.6 2.51 .31 1.25 5.05 1.06 3.58 1295 1
14.83 1.64 2.17 14 97 2.8 2.98 .29 1.98 5.2 1.08 2.85 1045 1
13.86 1.35 2.27 16 98 2.98 3.15 .22 1.85 7.22 1.01 3.55 1045 1
14.1 2.16 2.3 18 105 2.95 3.32 .22 2.38 5.75 1.25 3.17 1510 1
14.12 1.48 2.32 16.8 95 2.2 2.43 .26 1.57 5 1.17 2.82 1280 1
13.75 1.73 2.41 16 89 2.6 2.76 .29 1.81 5.6 1.15 2.9 1320 1
14.75 1.73 2.39 11.4 91 3.1 3.69 .43 2.81 5.4 1.25 2.73 1150 1
14.38 1.87 2.38 12 102 3.3 3.64 .29 2.96 7.5 1.2 3 1547
1
13.63 1.81 2.7 17.2 112 2.85 2.91 .3 1.46 7.3 1.28 2.88 1310 1
14.3 1.92 2.72 20 120 2.8 3.14 .33 1.97 6.2 1.07 2.65 1280 1
13.83 1.57 2.62 20 115 2.95 3.4 .4 1.72 6.6 1.13 2.57 1130 1
14.19 1.59 2.48 16.5 108 3.3 3.93 .32 1.86 8.7 1.23 2.82 1680 1
13.64 3.1 2.56 15.2 116 2.7 3.03 .17 1.66 5.1 .96 3.36 845 1
14.06 1.63 2.28 16 126 3 3.17 .24 2.1 5.65 1.09 3.71 780 1
12.93 3.8 2.65 18.6 102 2.41 2.41 .25 1.98 4.5 1.03 3.52 770 1
13.71 1.86 2.36 16.6 101 2.61 2.88 .27 1.69 3.8 1.11 4 1035 1
12.85 1.6 2.52 17.8 95 2.48 2.37 .26 1.46 3.93 1.09 3.63 1015 1
13.5 1.81 2.61 20 96 2.53 2.61 .28 1.66 3.52 1.12 3.82 845 1
13.05 2.05 3.22 25 124 2.63 2.68 .47 1.92 3.58 1.13 3.2 830 1
13.39 1.77 2.62 16.1 93 2.85 2.94 .34 1.45 4.8 .92 3.22 1195 1
13.3 1.72 2.14 17 94 2.4 2.19 .27 1.35 3.95 1.02 2.77 1285 1
13.87 1.9 2.8 19.4 107 2.95 2.97 .37 1.76 4.5 1.25 3.4 915 1
14.02 1.68 2.21 16 96 2.65 2.33 .26 1.98 4.7 1.04 3.59 1035 1
13.73 1.5 2.7 22.5 101 3 3.25 .29 2.38 5.7 1.19 2.71 1285 1
13.58 1.66 2.36 19.1 106 2.86 3.19 .22 1.95 6.9 1.09 2.88 1515 1
13.68 1.83 2.36 17.2 104 2.42 2.69 .42 1.97 3.84 1.23 2.87 990 1
13.76 1.53 2.7 19.5 132 2.95 2.74 .5 1.35 5.4 1.25 3 1235 1
13.51 1.8 2.65 19 110 2.35 2.53 .29 1.54 4.2 1.1 2.87 1095 1
13.48 1.81 2.41 20.5 100 2.7 2.98 .26 1.86 5.1 1.04 3.47 920 1
13.28 1.64 2.84 15.5 110 2.6 2.68 .34 1.36 4.6 1.09 2.78 880 1
13.05 1.65 2.55 18 98 2.45 2.43 .29 1.44 4.25 1.12 2.51 1105 1
13.07 1.5 2.1 15.5 98 2.4 2.64 .28 1.37 3.7 1.18 2.69 1020 1
14.22 3.99 2.51 13.2 128 3 3.04 .2 2.08 5.1 .89 3.53 760 1
13.56 1.71 2.31 16.2 117 3.15 3.29 .34 2.34 6.13 .95 3.38 795 1
13.41 3.84 2.12 18.8 90 2.45 2.68 .27 1.48 4.28 .91 3 1035 1
13.88 1.89 2.59 15 101 3.25 3.56 .17 1.7 5.43 .88 3.56 1095 1
13.24 3.98 2.29 17.5 103 2.64 2.63 .32 1.66 4.36 .82 3 680 1
13.05 1.77 2.1 17 107 3 3 .28 2.03 5.04 .88 3.35 885 1
14.21 4.04 2.44 18.9 111 2.85 2.65 .3 1.25 5.24 .87 3.33 1080 1
14.38 3.59 2.28 16 102 3.25 3.17 .27 2.19 4.9 1.04 3.44 1065 1
13.9 1.68 2.12 16 101 3.1 3.39 .21 2.14 6.1 .91 3.33 985 1
14.1 2.02 2.4 18.8 103 2.75 2.92 .32 2.38 6.2 1.07 2.75 1060 1
13.94 1.73 2.27 17.4 108 2.88 3.54 .32 2.08 8.90 1.12 3.1 1260 1
APPENDIX C
Results
The output results for each dataset are as follows:
1. Results for the minimum weight steel beam dataset
The algorithm is first executed with linear scaling. Each report line shows the iteration
number, the inputs, the desired output, and the network's actual output, all converted
back to their original ranges. Below is the output of that run:
886886 0.1200 0.0600 0.0180 0.0670 0.1000 0.0999
887887 0.1000 0.1000 0.0070 0.0280 0.0233 0.0216
888888 0.0600 0.0600 0.0040 0.0240 0.0110 0.0129
889889 0.2000 0.2000 0.0500 0.1000 0.1590 0.1590
890890 0.4000 0.4000 0.1900 0.1900 0.6313 0.6312
891891 0.2000 0.2000 0.1200 0.2400 0.3630 0.3630
892892 0.3500 0.3500 0.0350 0.0350 0.1000 0.1001
893893 0.1500 0.1500 0.0450 0.1200 0.1400 0.1403
894894 0.1500 0.1500 0.0350 0.0950 0.1000 0.0999
895895 0.0600 0.0600 0.0180 0.1230 0.0490 0.0487
896896 0.1200 0.0600 0.0180 0.0670 0.1000 0.0999
897897 0.1000 0.1000 0.0070 0.0280 0.0233 0.0216
898898 0.0600 0.0600 0.0040 0.0240 0.0110 0.0129
899899 0.2000 0.2000 0.0500 0.1000 0.1590 0.1590
Network did not converge
Total number of iterations = 900000
Confirm Training Cases
0.4000 0.4000 0.1900 0.1900 0.6313 0.6312
0.2000 0.2000 0.1200 0.2400 0.3630 0.3630
0.3500 0.3500 0.0350 0.0350 0.1000 0.1001
0.1500 0.1500 0.0450 0.1200 0.1400 0.1403
0.1500 0.1500 0.0350 0.0950 0.1000 0.0999
0.0600 0.0600 0.0180 0.1230 0.0490 0.0488
0.1200 0.0600 0.0180 0.0670 0.1000 0.1000
0.1000 0.1000 0.0070 0.0280 0.0233 0.0216
0.0600 0.0600 0.0040 0.0240 0.0110 0.0129
0.2000 0.2000 0.0500 0.1000 0.1590 0.1590
Sum Squared Error for Training cases = 0.0172
% of Training Cases that meet criteria = 80.00
Running Test Cases
0.2000 0.2000 0.0300 0.0600 0.1000 0.0791
0.3000 0.3000 0.0950 0.1270 0.3900 0.3873
0.1500 0.1500 0.0100 0.0270 0.0415 0.0318
0.4000 0.4000 0.1200 0.1200 0.4830 0.4943
0.2800 0.2800 0.1200 0.1710 0.3900 0.4794
0.1700 0.1700 0.0200 0.0470 0.0700 0.0522
Sum Squared Error for Testing cases = 0.0189
% of Testing Cases that meet criteria = 83.33
The algorithm is then executed with the non-linear median scaling applied. Below is the
output of that run:
344344 0.1500 0.1500 0.0350 0.0950 0.1000 0.0919
345345 0.0600 0.0600 0.0180 0.1230 0.0490 0.0490
346346 0.1200 0.0600 0.0180 0.0670 0.1000 0.1000
347347 0.1000 0.1000 0.0070 0.0280 0.0233 0.0238
348348 0.0600 0.0600 0.0040 0.0240 0.0110 0.0099
349349 0.2000 0.2000 0.0500 0.1000 0.1590 0.1613
350350 0.4000 0.4000 0.1900 0.1900 0.6313 0.6287
351351 0.2000 0.2000 0.1200 0.2400 0.3630 0.3621
352352 0.3500 0.3500 0.0350 0.0350 0.1000 0.1000
353353 0.1500 0.1500 0.0450 0.1200 0.1400 0.1461
354354 0.1500 0.1500 0.0350 0.0950 0.1000 0.0922
355355 0.0600 0.0600 0.0180 0.1230 0.0490 0.0490
356356 0.1200 0.0600 0.0180 0.0670 0.1000 0.1000
357357 0.1000 0.1000 0.0070 0.0280 0.0233 0.0237
358358 0.0600 0.0600 0.0040 0.0240 0.0110 0.0100
Converged to within criteria
Total number of iterations = 359299
Confirm Training Cases
0.2000 0.2000 0.0500 0.1000 0.1590 0.1612
0.4000 0.4000 0.1900 0.1900 0.6313 0.6288
0.2000 0.2000 0.1200 0.2400 0.3630 0.3623
0.3500 0.3500 0.0350 0.0350 0.1000 0.1003
0.1500 0.1500 0.0450 0.1200 0.1400 0.1462
0.1500 0.1500 0.0350 0.0950 0.1000 0.0931
0.0600 0.0600 0.0180 0.1230 0.0490 0.0491
0.1200 0.0600 0.0180 0.0670 0.1000 0.1003
0.1000 0.1000 0.0070 0.0280 0.0233 0.0237
0.0600 0.0600 0.0040 0.0240 0.0110 0.0100
Sum Squared Error for Training cases = 0.0001
% of Training Cases that meet criteria = 100
Running Test Cases
0.2000 0.2000 0.0300 0.0600 0.1000 0.1020
0.3000 0.3000 0.0950 0.1270 0.3900 0.3746
0.1500 0.1500 0.0100 0.0270 0.0415 0.0632
0.4000 0.4000 0.1200 0.1200 0.4830 0.4994
0.2800 0.2800 0.1200 0.1710 0.3900 0.4814
0.1700 0.1700 0.0200 0.0470 0.0700 0.0905
Sum Squared Error for Testing cases = 0.0307
% of Testing Cases that meet criteria = 100
2. Results for the two-dimensional projectile motion dataset
This dataset is first run through the backpropagation algorithm with linear scaling.
Below is the output of that run:
89977888 72.7617 44.5306 0.0000 36.0000 2.0000 4.3939 58.4399 43.8725 48.7751 104.7192
89978889 -172.7895 -32.8454 0.0000 39.0000 3.0000 4.3372 47.5066 174.3488 55.0915 149.3253
89979890 -21.1724 23.4599 0.0000 35.0000 5.0000 0.0000 7.6959 135.0000 27.2050 180.0532
89980891 110.5869 16.1809 0.0000 37.0000 1.0000 2.6621 56.5958 3.1315 50.9714 199.3150
89981892 261.0820 46.3084 0.0000 40.0000 8.0000 0.0000 81.7613 90.0000 69.1375 100.5568
89982893 23.7422 -22.4497 0.0000 32.0000 2.0000 5.0736 84.6955 62.7784 54.7421 225.9713
89983894 -42.9145 -19.9635 0.0000 32.0000 5.0000 0.2630 15.9800 203.5113 35.1318 186.7456
89984895 43.8675 111.6771 0.0000 36.0000 7.0000 0.0000 28.8077 90.0000 46.1808 105.9781
89985896 -13.0707 255.3198 0.0000 46.0000 9.0000 1.7590 27.0359 89.2597 51.0034 90.3390
89986897 -79.0395 41.7969 0.0000 33.0000 9.0000 1.7895 21.1974 167.8664 29.6163 205.0539
89987898 -19.6245 -177.9393 0.0000 40.0000 2.0000 3.6936 42.0222 272.0479 45.2451 273.9994
89988899 -32.5601 71.8381 0.0000 40.0000 6.0000 4.7127 18.2334 109.6095 35.5439 127.9712
89989900 31.1128 -148.6197 0.0000 43.0000 1.0000 0.0000 64.0141 270.0000 53.6213 287.1136
89990901 171.8071 -108.0815 0.0000 43.0000 8.0000 0.0000 27.0544 315.0000 41.7095 284.4379
89991902 -171.9108 18.1377 0.0000 36.0000 8.0000 3.4680 71.1371 82.0294 68.8775 160.8337
89992903 -74.4172 101.2078 0.0000 40.0000 9.0000 2.2764 18.1455 125.0931 46.5662 171.3997
89993904 -148.1408 -91.9054 0.0000 32.0000 8.0000 3.1597 49.0996 240.6482 53.6533 225.2897
89994905 125.9776 -16.5148 0.0000 46.0000 2.0000 5.4039 78.1282 33.9058 46.4541 176.2502
89995906 -27.2345 -52.1089 0.0000 32.0000 6.0000 2.5699 51.3839 292.1707 41.0238 202.6173
89996907 30.2260 1.8990 0.0000 32.0000 2.0000 2.7425 70.3060 348.8768 58.3618 168.6078
89997908 -2.2989 71.5865 0.0000 40.0000 1.0000 1.2326 82.4052 107.9166 39.6973 98.3865
89998909 152.0903 42.7557 0.0000 31.0000 8.0000 0.0000 77.0752 90.0000 56.6407 102.4234
89999910 91.4196 -72.5273 0.0000 46.0000 1.0000 5.5390 15.8439 321.6954 40.5753 224.6502
Network did not converge
Total number of iterations = 90000000
Confirm Training Cases
178.9687 23.4961 0.0000 46.0000 5.0000 0.0125 21.9821 8.8686 64.6341 113.7757
230.7672 130.0264 0.0000 46.0000 3.0000 0.0000 60.8047 45.0000 51.8464 -19.9321
-21.1724 23.4599 0.0000 35.0000 5.0000 0.0000 7.6959 135.0000 36.6128 177.0297
-8.8855 86.9114 0.0000 36.0000 7.0000 6.0688 26.2457 115.3928 42.0572 134.7640
104.1476 67.7624 0.0000 31.0000 3.0000 0.0000 51.1223 45.0000 66.8488 -19.9321
11.7676 -92.6438 0.0000 40.0000 4.0000 0.0000 17.2861 270.0000 54.8268 288.5219
69.6204 -9.6036 0.0000 43.0000 9.0000 5.3348 53.1830 109.3070 29.8834 186.5350
25.2443 -25.2443 0.0000 40.0000 0.0000 0.0000 83.6846 315.0000 58.6743 245.8078
2.2759 -52.7335 0.0000 35.0000 5.0000 0.6854 13.6466 266.7787 51.6609 252.8812
10.1286 -133.8500 0.0000 32.0000 7.0000 4.9717 82.0924 164.3093 75.9074 242.7174
-117.9191 81.4699 0.0000 39.0000 8.0000 2.2939 71.2305 290.0352 43.1565 167.2508
51.0201 0.5967 0.0000 35.0000 2.0000 0.0000 89.8632 90.0000 58.0049 100.6049
-234.9254 -39.5119 0.0000 40.0000 7.0000 3.8667 64.5243 136.9512 61.2702 182.8140
245.8805 -161.6241 0.0000 37.0000 9.0000 5.9184 71.0984 291.4706 78.8932 318.4534
-258.3213 171.8963 0.0000 43.0000 8.0000 2.3965 82.4826 223.0216 72.9146 118.7620
44.6163 -90.0276 0.0000 33.0000 4.0000 4.1680 64.2199 341.8543 63.1254 282.0471
-234.4236 -131.9078 0.0000 40.0000 7.0000 3.6872 82.8753 198.4846 62.1360 184.4935
92.7752 -74.8569 0.0000 35.0000 3.0000 0.0000 28.9386 315.0000 64.6133 361.5471
-83.1771 13.3336 0.0000 37.0000 2.0000 0.6102 19.8037 173.8225 38.5355 178.5257
41.6193 25.2473 0.0000 40.0000 7.0000 0.9806 7.9414 28.8339 36.5113 162.7288
5.8573 59.3921 0.0000 33.0000 3.0000 1.8828 83.0231 350.5439 46.6688 130.2809
78.1169 34.8598 0.0000 33.0000 5.0000 2.2508 30.5585 7.1259 52.5647 142.8254
-30.3512 -140.0638 0.0000 40.0000 0.0000 1.1769 59.3108 257.7733 64.2965 295.4913
77.9839 -170.2764 0.0000 40.0000 8.0000 5.1735 77.7614 121.4381 75.5442 248.7630
Sum Squared Error for Training cases = 49.8366
% of Training Cases that meet criteria = 4.84
Running Test Cases
-136.5831 10.4916 0.0000 39.0000 7.0000 1.9283 26.7958 194.7109 47.2232 187.9361
34.3398 210.0049 0.0000 49.0000 1.0000 4.5090 34.5423 80.5470 55.5612 84.1371
82.9870 -124.3327 0.0000 32.0000 9.0000 4.9309 33.7814 316.5595 75.4170 245.3352
-48.6279 40.2555 0.0000 35.0000 9.0000 0.5604 37.4732 182.3330 31.0888 190.3178
153.0100 174.3223 0.0000 46.0000 4.0000 1.3613 73.0171 7.9212 55.5231 84.4360
368.8978 -0.0000 0.0000 49.0000 9.0000 0.0000 34.0978 0.0000 86.1550 153.1886
-165.1387 119.6764 0.0000 37.0000 4.0000 2.7543 53.3036 136.5933 64.9642 119.2699
15.6687 -31.5330 0.0000 37.0000 7.0000 0.5740 7.4114 291.1254 38.8739 209.5419
-62.0044 -41.2591 0.0000 32.0000 4.0000 5.8651 27.5143 203.2999 43.6235 201.7343
-317.2747 23.8801 0.0000 49.0000 6.0000 3.8348 40.9752 154.0088 62.2519 179.2567
-59.0752 -40.2403 0.0000 31.0000 6.0000 3.9809 18.7016 211.4226 44.0200 204.6961
152.6800 -0.0000 0.0000 32.0000 4.0000 0.0000 62.5278 0.0000 63.4093 -19.9315
109.5231 -110.5396 0.0000 40.0000 7.0000 0.4121 31.4505 290.6868 61.7914 292.2250
-51.7418 -13.9461 0.0000 40.0000 1.0000 2.0395 79.8504 229.0027 21.3260 237.4694
203.8975 -150.3364 0.0000 46.0000 2.0000 5.6647 59.8957 323.2586 65.7052 370.1521
-0.2333 0.3988 0.0000 43.0000 9.0000 0.3752 0.0702 120.3868 27.8146 196.5201
108.3641 -148.6955 0.0000 43.0000 9.0000 5.1323 68.8050 96.5263 69.6681 241.5652
17.3378 -97.6280 0.0000 37.0000 7.0000 0.0000 49.3766 225.0000 52.3217 289.5082
Sum Squared Error for Testing cases = 1.8085
% of Testing Cases that meet criteria = 22.22
The algorithm is then executed with the non-linear median scaling applied. Below is the
output of that run:
89977888 72.7617 44.5306 0.0000 36.0000 2.0000 4.3939 58.4399 43.8725 46.0570 91.6389
89978889 -172.7895 -32.8454 0.0000 39.0000 3.0000 4.3372 47.5066 174.3489 54.7174 151.1034
89979890 -21.1724 23.4599 0.0000 35.0000 5.0000 0.0000 7.6959 135.0000 26.1861 188.3340
89980891 110.5869 16.1809 0.0000 37.0000 1.0000 2.6621 56.5958 3.1315 50.7372 224.9430
89981892 261.0820 46.3084 0.0000 40.0000 8.0000 0.0000 81.7612 90.0000 75.3395 -7.4797
89982893 23.7422 -22.4497 0.0000 32.0000 2.0000 5.0736 84.6955 62.7784 49.8706 255.6838
89983894 -42.9145 -19.9635 0.0000 32.0000 5.0000 0.2630 15.9800 203.5113 23.9576 205.8927
89984895 43.8675 111.6771 0.0000 36.0000 7.0000 0.0000 28.8077 90.0000 46.8638 126.0627
89985896 -13.0707 255.3198 0.0000 46.0000 9.0000 1.7590 27.0360 89.2597 53.3963 72.4230
89986897 -79.0395 41.7969 0.0000 33.0000 9.0000 1.7895 21.1974 167.8664 31.3773 224.1386
89987898 -19.6245 -177.9393 0.0000 40.0000 2.0000 3.6936 42.0222 272.0479 42.8211 313.6628
89988899 -32.5601 71.8381 0.0000 40.0000 6.0000 4.7127 18.2334 109.6095 41.5594 107.9797
89989900 31.1128 -148.6197 0.0000 43.0000 1.0000 0.0000 64.0141 270.0000 52.2251 274.2804
89990901 171.8071 -108.0815 0.0000 43.0000 8.0000 0.0000 27.0544 315.0000 46.8173 299.3389
89991902 -171.9108 18.1377 0.0000 36.0000 8.0000 3.4680 71.1371 82.0294 67.1349 150.1788
89992903 -74.4172 101.2078 0.0000 40.0000 9.0000 2.2764 18.1455 125.0931 50.9134 184.5170
89993904 -148.1408 -91.9054 0.0000 32.0000 8.0000 3.1597 49.0996 240.6482 40.3980 266.6870
89994905 125.9776 -16.5148 0.0000 46.0000 2.0000 5.4039 78.1282 33.9058 38.9918 181.3696
89995906 -27.2345 -52.1089 0.0000 32.0000 6.0000 2.5699 51.3839 292.1707 25.0806 236.7288
89996907 30.2260 1.8990 0.0000 32.0000 2.0000 2.7425 70.3060 348.8768 47.1975 200.4385
89997908 -2.2989 71.5865 0.0000 40.0000 1.0000 1.2326 82.4052 107.9166 36.2531 112.0635
89998909 152.0903 42.7557 0.0000 31.0000 8.0000 0.0000 77.0752 90.0000 55.1508 -9.7306
89999910 91.4196 -72.5273 0.0000 46.0000 1.0000 5.5390 15.8439 321.6954 46.0183 185.1858
Network did not converge
Total number of iterations = 90000000
Confirm Training Cases
178.9687 23.4961 0.0000 46.0000 5.0000 0.0125 21.9821 8.8686 53.6210 131.0318
230.7672 130.0264 0.0000 46.0000 3.0000 0.0000 60.8046 45.0000 60.1170 -9.9999
-21.1724 23.4599 0.0000 35.0000 5.0000 0.0000 7.6959 135.0000 39.6184 186.0025
-8.8855 86.9114 0.0000 36.0000 7.0000 6.0688 26.2457 115.3928 41.9491 135.4553
104.1476 67.7624 0.0000 31.0000 3.0000 0.0000 51.1223 45.0000 69.8545 -9.9995
11.7676 -92.6438 0.0000 40.0000 4.0000 0.0000 17.2861 270.0000 49.0758 279.7641
69.6204 -9.6036 0.0000 43.0000 9.0000 5.3348 53.1830 109.3070 32.4829 180.1147
25.2443 -25.2443 0.0000 40.0000 0.0000 0.0000 83.6846 315.0000 53.8916 197.8363
2.2759 -52.7335 0.0000 35.0000 5.0000 0.6854 13.6466 266.7787 46.9608 260.0039
10.1286 -133.8500 0.0000 32.0000 7.0000 4.9717 82.0924 164.3093 69.2704 266.0002
-117.9191 81.4699 0.0000 39.0000 8.0000 2.2939 71.2305 290.0352 43.9545 179.6308
51.0201 0.5967 0.0000 35.0000 2.0000 0.0000 89.8632 90.0000 63.5299 56.4707
-234.9254 -39.5119 0.0000 40.0000 7.0000 3.8667 64.5243 136.9512 62.5969 171.8357
245.8805 -161.6241 0.0000 37.0000 9.0000 5.9184 71.0984 291.4706 69.7580 278.4772
-258.3213 171.8963 0.0000 43.0000 8.0000 2.3965 82.4826 223.0216 72.1274 131.7584
44.6163 -90.0276 0.0000 33.0000 4.0000 4.1680 64.2199 341.8543 67.9425 263.2658
-234.4236 -131.9078 0.0000 40.0000 7.0000 3.6872 82.8753 198.4846 62.4416 177.4301
92.7752 -74.8569 0.0000 35.0000 3.0000 0.0000 28.9385 315.0000 67.7422 326.8099
-83.1771 13.3336 0.0000 37.0000 2.0000 0.6102 19.8037 173.8225 38.4212 193.8217
41.6193 25.2473 0.0000 40.0000 7.0000 0.9806 7.9414 28.8339 40.2185 170.1645
5.8573 59.3921 0.0000 33.0000 3.0000 1.8828 83.0230 350.5439 46.7700 150.4367
78.1169 34.8598 0.0000 33.0000 5.0000 2.2508 30.5585 7.1259 49.2888 169.7477
-30.3512 -140.0638 0.0000 40.0000 0.0000 1.1769 59.3108 257.7733 62.1667 304.8369
77.9839 -170.2764 0.0000 40.0000 8.0000 5.1735 77.7614 121.4381 67.8393 268.6543
Sum Squared Error for Training cases = 59.0457
% of Training Cases that meet criteria = 12.168
Running Test Cases
-136.5831 10.4916 0.0000 39.0000 7.0000 1.9283 26.7958 194.7109 44.8325 226.2114
34.3398 210.0049 0.0000 49.0000 1.0000 4.5090 34.5423 80.5470 55.0247 66.7159
82.9870 -124.3327 0.0000 32.0000 9.0000 4.9309 33.7814 316.5595 68.5860 268.8288
-48.6279 40.2555 0.0000 35.0000 9.0000 0.5604 37.4732 182.3330 36.8600 190.5780
153.0100 174.3223 0.0000 46.0000 4.0000 1.3613 73.0171 7.9212 47.6661 111.9258
368.8978 -0.0000 0.0000 49.0000 9.0000 0.0000 34.0978 0.0000 78.9937 178.7031
-165.1387 119.6764 0.0000 37.0000 4.0000 2.7543 53.3036 136.5933 66.8384 121.7134
15.6687 -31.5330 0.0000 37.0000 7.0000 0.5740 7.4114 291.1255 33.3119 224.1601
-62.0044 -41.2591 0.0000 32.0000 4.0000 5.8651 27.5143 203.2999 47.1139 191.4429
-317.2747 23.8801 0.0000 49.0000 6.0000 3.8348 40.9752 154.0088 62.9408 171.2888
-59.0752 -40.2403 0.0000 31.0000 6.0000 3.9809 18.7016 211.4226 44.8314 188.5400
152.6800 -0.0000 0.0000 32.0000 4.0000 0.0000 62.5278 0.0000 71.7174 -9.8828
109.5231 -110.5396 0.0000 40.0000 7.0000 0.4121 31.4505 290.6868 63.3577 296.6970
-51.7418 -13.9461 0.0000 40.0000 1.0000 2.0395 79.8504 229.0027 30.5632 213.2331
203.8975 -150.3364 0.0000 46.0000 2.0000 5.6647 59.8957 323.2586 68.3636 314.5242
-0.2333 0.3988 0.0000 43.0000 9.0000 0.3752 0.0702 120.3868 29.7459 198.8738
108.3641 -148.6955 0.0000 43.0000 9.0000 5.1323 68.8050 96.5263 63.8980 260.3573
17.3378 -97.6280 0.0000 37.0000 7.0000 0.0000 49.3766 225.0000 50.1579 285.8929
Sum Squared Error for Testing cases = 2.1026
% of Testing Cases that meet criteria = 15.556
3. Results for the wine recognition dataset
This dataset is first run through the backpropagation algorithm with linear scaling.
Below is the output of that run:
260260 13.5000 1.8100 2.6100 20.0000 96.0000 2.5300 2.6100 0.2800 1.6600 3.5200 1.1200 3.8200 845.0000 1.0000
1.0407
261261 12.6000 2.4600 2.2000 18.5000 94.0000 1.6200 0.6600 0.6300 0.9400 7.1000 0.7300 1.5800 695.0000 3.0000
3.0017
262262 13.3400 0.9400 2.3600 17.0000 110.0000 2.5300 1.3000 0.5500 0.4200 3.1700 1.0200 1.9300 750.0000 2.0000
2.0011
263263 13.2000 1.7800 2.1400 11.2000 100.0000 2.6500 2.7600 0.2600 1.2800 4.3800 1.0500 3.4000 1050.0000 1.0000
0.9780
264264 11.7600 2.6800 2.9200 20.0000 103.0000 1.7500 2.0300 0.6000 1.0500 3.8000 1.2300 2.5000 607.0000 2.0000
1.9211
265265 14.2100 4.0400 2.4400 18.9000 111.0000 2.8500 2.6500 0.3000 1.2500 5.2400 0.8700 3.3300 1080.0000 1.0000
1.0663
266266 13.8400 4.1200 2.3800 19.5000 89.0000 1.8000 0.8300 0.4800 1.5600 9.0100 0.5700 1.6400 480.0000 3.0000
3.0046
267267 12.0800 1.3300 2.3000 23.6000 70.0000 2.2000 1.5900 0.4200 1.3800 1.7400 1.0700 3.2100 625.0000 2.0000
2.0348
268268 13.7100 1.8600 2.3600 16.6000 101.0000 2.6100 2.8800 0.2700 1.6900 3.8000 1.1100 4.0000 1035.0000 1.0000
0.9802
269269 12.7000 3.5500 2.3600 21.5000 106.0000 1.7000 1.2000 0.1700 0.8400 5.0000 0.7800 1.2900 600.0000 3.0000
3.0420
270270 13.1100 1.0100 1.7000 15.0000 78.0000 2.9800 3.1800 0.2600 2.2800 5.3000 1.1200 3.1800 502.0000 2.0000
2.0189
271271 14.1300 4.1000 2.7400 24.5000 96.0000 2.0500 0.7600 0.5600 1.3500 9.2000 0.6100 1.6000 560.0000 3.0000
2.9957
272272 11.4600 3.7400 1.8200 19.5000 107.0000 3.1800 2.5800 0.2400 3.5800 2.9000 0.7500 2.8100 562.0000 2.0000
2.0450
273273 13.2400 3.9800 2.2900 17.5000 103.0000 2.6400 2.6300 0.3200 1.6600 4.3600 0.8200 3.0000 680.0000 1.0000
1.0531
274274 12.5800 1.2900 2.1000 20.0000 103.0000 1.4800 0.5800 0.5300 1.4000 7.6000 0.5800 1.5500 640.0000 3.0000
3.0433
Converged to within criteria
Total number of iterations = 275049
Confirm Training Cases
12.8500 1.6000 2.5200 17.8000 95.0000 2.4800 2.3700 0.2600 1.4600 3.9300 1.0900 3.6300 1015.0000 1.0000
1.0222
13.5000 1.8100 2.6100 20.0000 96.0000 2.5300 2.6100 0.2800 1.6600 3.5200 1.1200 3.8200 845.0000 1.0000
1.0351
13.0500 2.0500 3.2200 25.0000 124.0000 2.6300 2.6800 0.4700 1.9200 3.5800 1.1300 3.2000 830.0000 1.0000
1.0925
13.3900 1.7700 2.6200 16.1000 93.0000 2.8500 2.9400 0.3400 1.4500 4.8000 0.9200 3.2200 1195.0000 1.0000
0.9826
13.3000 1.7200 2.1400 17.0000 94.0000 2.4000 2.1900 0.2700 1.3500 3.9500 1.0200 2.7700 1285.0000 1.0000
1.0050
13.8700 1.9000 2.8000 19.4000 107.0000 2.9500 2.9700 0.3700 1.7600 4.5000 1.2500 3.4000 915.0000 1.0000
0.9845
14.0200 1.6800 2.2100 16.0000 96.0000 2.6500 2.3300 0.2600 1.9800 4.7000 1.0400 3.5900 1035.0000 1.0000
0.9831
13.7300 1.5000 2.7000 22.5000 101.0000 3.0000 3.2500 0.2900 2.3800 5.7000 1.1900 2.7100 1285.0000 1.0000
0.9824
13.5800 1.6600 2.3600 19.1000 106.0000 2.8600 3.1900 0.2200 1.9500 6.9000 1.0900 2.8800 1515.0000 1.0000
0.9825
13.6800 1.8300 2.3600 17.2000 104.0000 2.4200 2.6900 0.4200 1.9700 3.8400 1.2300 2.8700 990.0000 1.0000
1.0146
13.7600 1.5300 2.7000 19.5000 132.0000 2.9500 2.7400 0.5000 1.3500 5.4000 1.2500 3.0000 1235.0000 1.0000
0.9849
13.5100 1.8000 2.6500 19.0000 110.0000 2.3500 2.5300 0.2900 1.5400 4.2000 1.1000 2.8700 1095.0000 1.0000
0.9915
13.4800 1.8100 2.4100 20.5000 100.0000 2.7000 2.9800 0.2600 1.8600 5.1000 1.0400 3.4700 920.0000 1.0000
1.0307
13.2800 1.6400 2.8400 15.5000 110.0000 2.6000 2.6800 0.3400 1.3600 4.6000 1.0900 2.7800 880.0000 1.0000
0.9861
13.0500 1.6500 2.5500 18.0000 98.0000 2.4500 2.4300 0.2900 1.4400 4.2500 1.1200 2.5100 1105.0000 1.0000
1.0132
13.0700 1.5000 2.1000 15.5000 98.0000 2.4000 2.6400 0.2800 1.3700 3.7000 1.1800 2.6900 1020.0000 1.0000
1.1099
Sum Squared Error for Training cases = 0.0596
% of Training Cases that meet criteria = 99.4382
Running Test Cases
14.3700 1.9500 2.5000 16.8000 113.0000 3.8500 3.4900 0.2400 2.1800 7.8000 0.8600 3.4500 1480.0000 1.0000
1.0179
14.1000 2.1600 2.3000 18.0000 105.0000 2.9500 3.3200 0.2200 2.3800 5.7500 1.2500 3.1700 1510.0000 1.0000
0.9749
13.0500 1.7300 2.0400 12.4000 92.0000 2.7200 3.2700 0.1700 2.9100 7.2000 1.1200 2.9100 1150.0000 1.0000
0.9917
13.7400 1.6700 2.2500 16.4000 118.0000 2.6000 2.9000 0.2100 1.6200 5.8500 0.9200 3.2000 1060.0000 1.0000
0.9817
13.5600 1.7300 2.4600 20.5000 116.0000 2.9600 2.7800 0.2000 2.4500 6.2500 0.9800 3.0300 1120.0000 1.0000
0.9911
12.7200 1.8100 2.2000 18.8000 86.0000 2.2000 2.5300 0.2600 1.7700 3.9000 1.1600 3.1400 714.0000 2.0000
1.9294
12.4700 1.5200 2.2000 19.0000 162.0000 2.5000 2.2700 0.3200 3.2800 2.6000 1.1600 2.6300 937.0000 2.0000
1.9731
11.6100 1.3500 2.7000 20.0000 94.0000 2.7400 2.9200 0.2900 2.4900 2.6500 0.9600 3.2600 680.0000 2.0000
2.0100
11.8700 4.3100 2.3900 21.0000 82.0000 2.8600 3.0300 0.2100 2.9100 2.8000 0.7500 3.6400 380.0000 2.0000
2.0596
12.0700 2.1600 2.1700 21.0000 85.0000 2.6000 2.6500 0.3700 1.3500 2.7600 0.8600 3.2800 378.0000 2.0000
2.0362
12.8600 1.3500 2.3200 18.0000 122.0000 1.5100 1.2500 0.2100 0.9400 4.1000 0.7600 1.2900 630.0000 3.0000
2.9178
13.0800 3.9000 2.3600 21.5000 113.0000 1.4100 1.3900 0.3400 1.1400 9.4000 0.5700 1.3300 550.0000 3.0000
3.0236
14.3400 1.6800 2.7000 25.0000 98.0000 2.8000 1.3100 0.5300 2.7000 13.0000 0.5700 1.9600 660.0000 3.0000
2.9585
13.4800 1.6700 2.6400 22.5000 89.0000 2.6000 1.1000 0.5200 2.2900 11.7500 0.5700 1.7800 620.0000 3.0000
2.9972
13.7100 5.6500 2.4500 20.5000 95.0000 1.6800 0.6100 0.5200 1.0600 7.7000 0.6400 1.7400 740.0000 3.0000
2.9643
Sum Squared Error for Testing cases = 0.0045
% of Testing Cases that meet criteria = 100.0000
The algorithm is then executed with the non-linear median scaling applied. Below is the
output of that run:
290290 13.5000 3.1200 2.6200 24.0000 123.0000 1.4000 1.5700 0.2200 1.2500 8.6000 0.5900 1.3000 500.0000
3.0000 3.0386
291291 13.0500 3.8600 2.3200 22.5000 85.0000 1.6500 1.5900 0.6100 1.6200 4.8000 0.8400 2.0100 515.0000
2.0000 2.0194
292292 14.3000 1.9200 2.7200 20.0000 120.0000 2.8000 3.1400 0.3300 1.9700 6.2000 1.0700 2.6500 1280.0000
1.0000 0.9919
293293 11.7900 2.1300 2.7800 28.5000 92.0000 2.1300 2.2400 0.5800 1.7600 3.0000 0.9700 2.4400 466.0000
2.0000 1.9906
294294 12.3300 1.1000 2.2800 16.0000 101.0000 2.0500 1.0900 0.6300 0.4100 3.2700 1.2500 1.6700 680.0000
2.0000 2.0120
295295 12.7700 2.3900 2.2800 19.5000 86.0000 1.3900 0.5100 0.4800 0.6400 9.9000 0.5700 1.6300 470.0000
3.0000 3.0509
296296 12.5100 1.7300 1.9800 20.5000 85.0000 2.2000 1.9200 0.3200 1.4800 2.9400 1.0400 3.5700 672.0000
2.0000 2.0025
297297 13.0500 1.6500 2.5500 18.0000 98.0000 2.4500 2.4300 0.2900 1.4400 4.2500 1.1200 2.5100 1105.0000
1.0000 1.0075
298298 13.3200 3.2400 2.3800 21.5000 92.0000 1.9300 0.7600 0.4500 1.2500 8.4200 0.5500 1.6200 650.0000
3.0000 3.0420
299299 12.7200 1.8100 2.2000 18.8000 86.0000 2.2000 2.5300 0.2600 1.7700 3.9000 1.1600 3.1400 714.0000
2.0000 2.0321
300300 14.3800 1.8700 2.3800 12.0000 102.0000 3.3000 3.6400 0.2900 2.9600 7.5000 1.2000 3.0000 1547.0000
1.0000 0.9786
301301 12.0700 2.1600 2.1700 21.0000 85.0000 2.6000 2.6500 0.3700 1.3500 2.7600 0.8600 3.2800 378.0000
2.0000 2.0124
302302 13.7200 1.4300 2.5000 16.7000 108.0000 3.4000 3.6700 0.1900 2.0400 6.8000 0.8900 2.8700 1285.0000
1.0000 0.9806
303303 13.4000 4.6000 2.8600 25.0000 112.0000 1.9800 0.9600 0.2700 1.1100 8.5000 0.6700 1.9200 630.0000
3.0000 3.0243
304304 12.3400 2.4500 2.4600 21.0000 98.0000 2.5600 2.1100 0.3400 1.3100 2.8000 0.8000 3.3800 438.0000
2.0000 2.0512
305305 13.4800 1.8100 2.4100 20.5000 100.0000 2.7000 2.9800 0.2600 1.8600 5.1000 1.0400 3.4700 920.0000
1.0000 1.0297
306306 13.8800 5.0400 2.2300 20.0000 80.0000 0.9800 0.3400 0.4000 0.6800 4.9000 0.5800 1.3300 415.0000
3.0000 3.0467
Converged to within criteria
Total number of iterations = 306377
Confirm Training Cases
13.5000 1.8100 2.6100 20.0000 96.0000 2.5300 2.6100 0.2800 1.6600 3.5200 1.1200 3.8200 845.0000 1.0000
1.0242
13.0500 2.0500 3.2200 25.0000 124.0000 2.6300 2.6800 0.4700 1.9200 3.5800 1.1300 3.2000 830.0000 1.0000
1.0880
13.3900 1.7700 2.6200 16.1000 93.0000 2.8500 2.9400 0.3400 1.4500 4.8000 0.9200 3.2200 1195.0000 1.0000
0.9815
13.3000 1.7200 2.1400 17.0000 94.0000 2.4000 2.1900 0.2700 1.3500 3.9500 1.0200 2.7700 1285.0000 1.0000
0.9941
13.8700 1.9000 2.8000 19.4000 107.0000 2.9500 2.9700 0.3700 1.7600 4.5000 1.2500 3.4000 915.0000 1.0000
0.9858
14.0200 1.6800 2.2100 16.0000 96.0000 2.6500 2.3300 0.2600 1.9800 4.7000 1.0400 3.5900 1035.0000 1.0000
0.9832
13.7300 1.5000 2.7000 22.5000 101.0000 3.0000 3.2500 0.2900 2.3800 5.7000 1.1900 2.7100 1285.0000 1.0000
0.9855
13.5800 1.6600 2.3600 19.1000 106.0000 2.8600 3.1900 0.2200 1.9500 6.9000 1.0900 2.8800 1515.0000 1.0000
0.9818
13.6800 1.8300 2.3600 17.2000 104.0000 2.4200 2.6900 0.4200 1.9700 3.8400 1.2300 2.8700 990.0000 1.0000
1.0105
13.7600 1.5300 2.7000 19.5000 132.0000 2.9500 2.7400 0.5000 1.3500 5.4000 1.2500 3.0000 1235.0000 1.0000
0.9828
13.5100 1.8000 2.6500 19.0000 110.0000 2.3500 2.5300 0.2900 1.5400 4.2000 1.1000 2.8700 1095.0000 1.0000
0.9871
13.4800 1.8100 2.4100 20.5000 100.0000 2.7000 2.9800 0.2600 1.8600 5.1000 1.0400 3.4700 920.0000 1.0000
1.0290
13.2800 1.6400 2.8400 15.5000 110.0000 2.6000 2.6800 0.3400 1.3600 4.6000 1.0900 2.7800 880.0000 1.0000
0.9843
13.0500 1.6500 2.5500 18.0000 98.0000 2.4500 2.4300 0.2900 1.4400 4.2500 1.1200 2.5100 1105.0000 1.0000
1.0071
13.0700 1.5000 2.1000 15.5000 98.0000 2.4000 2.6400 0.2800 1.3700 3.7000 1.1800 2.6900 1020.0000 1.0000
1.0992
Sum Squared Error for Training cases = 0.0435
% of Training Cases that meet criteria = 100.0000
Running Test Cases
14.3700 1.9500 2.5000 16.8000 113.0000 3.8500 3.4900 0.2400 2.1800 7.8000 0.8600 3.4500 1480.0000 1.0000
0.9942
14.1000 2.1600 2.3000 18.0000 105.0000 2.9500 3.3200 0.2200 2.3800 5.7500 1.2500 3.1700 1510.0000 1.0000
0.9787
13.0500 1.7300 2.0400 12.4000 92.0000 2.7200 3.2700 0.1700 2.9100 7.2000 1.1200 2.9100 1150.0000 1.0000
0.9940
13.7400 1.6700 2.2500 16.4000 118.0000 2.6000 2.9000 0.2100 1.6200 5.8500 0.9200 3.2000 1060.0000 1.0000
0.9816
13.5600 1.7300 2.4600 20.5000 116.0000 2.9600 2.7800 0.2000 2.4500 6.2500 0.9800 3.0300 1120.0000 1.0000
0.9912
12.7200 1.8100 2.2000 18.8000 86.0000 2.2000 2.5300 0.2600 1.7700 3.9000 1.1600 3.1400 714.0000 2.0000
2.0376
12.4700 1.5200 2.2000 19.0000 162.0000 2.5000 2.2700 0.3200 3.2800 2.6000 1.1600 2.6300 937.0000 2.0000
1.9519
11.6100 1.3500 2.7000 20.0000 94.0000 2.7400 2.9200 0.2900 2.4900 2.6500 0.9600 3.2600 680.0000 2.0000
1.9965
11.8700 4.3100 2.3900 21.0000 82.0000 2.8600 3.0300 0.2100 2.9100 2.8000 0.7500 3.6400 380.0000 2.0000
2.0498
12.0700 2.1600 2.1700 21.0000 85.0000 2.6000 2.6500 0.3700 1.3500 2.7600 0.8600 3.2800 378.0000 2.0000
2.0286
12.8600 1.3500 2.3200 18.0000 122.0000 1.5100 1.2500 0.2100 0.9400 4.1000 0.7600 1.2900 630.0000 3.0000
2.9333
13.0800 3.9000 2.3600 21.5000 113.0000 1.4100 1.3900 0.3400 1.1400 9.4000 0.5700 1.3300 550.0000 3.0000
3.0373
14.3400 1.6800 2.7000 25.0000 98.0000 2.8000 1.3100 0.5300 2.7000 13.0000 0.5700 1.9600 660.0000 3.0000
2.9894
13.4800 1.6700 2.6400 22.5000 89.0000 2.6000 1.1000 0.5200 2.2900 11.7500 0.5700 1.7800 620.0000 3.0000
3.0314
13.7100 5.6500 2.4500 20.5000 95.0000 1.6800 0.6100 0.5200 1.0600 7.7000 0.6400 1.7400 740.0000 3.0000
2.9724
Sum Squared Error for Testing cases = 0.0040
% of Testing Cases that meet criteria = 100.0000