SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS
Ryan Joseph Norton
B.S., University of California, Davis, 2004
PROJECT
Submitted in partial satisfaction of
the requirements for the degree of
MASTER OF SCIENCE
in
COMPUTER SCIENCE
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
FALL
2010
SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS
A Project
by
Ryan Joseph Norton
Approved by:
__________________________________, Committee Chair
V Scott Gordon, Ph.D.
__________________________________, Second Reader
Behnam Arad, Ph.D.
____________________________
Date
Student: Ryan Joseph Norton
I certify that this student has met the requirements for format contained in the University format
manual, and that this project is suitable for shelving in the Library and credit is to be awarded for
the Project.
__________________________, Graduate Coordinator ________________
Nikrouz Faroughi, Ph.D.
Date
Department of Computer Science
Abstract
of
SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS
by
Ryan Joseph Norton
Self-splitting neural networks provide a new method for solving complex problems by using
multiple neural networks in a divide-and-conquer approach to reduce the domain space each
network must solve. However, choosing optimal points for splitting the domain is a difficult
problem. A visualization tool exists to help understand how splitting occurs in the self-splitting
neural network. This project provided several new enhancements to the tool to expand its scope
and improve existing functionality. These enhancements included a new extensible framework
for adding additional learning methods to the algorithm, integrating enhancements to the
algorithm that had been discovered since the original tool was released, and several new features
for observing how the domain space is partitioned. These modifications can be used to develop
further insights into the splitting and training processes.
_______________________, Committee Chair
V Scott Gordon, Ph.D.
_______________________
Date
TABLE OF CONTENTS

List of Tables
List of Figures
Chapter
1. INTRODUCTION
2. BACKGROUND
    2.1 Neural Networks
    2.2 Neural Network Training Algorithms
    2.3 Self-Splitting Neural Networks
    2.4 Technologies
3. VISUALIZATION ENHANCEMENTS
    3.1 Training Options
    3.2 Rewind / Replay Functionality
    3.3 Domain Scaling
    3.4 Normalization / Grayscale
    3.5 Logging Functionality
4. SOFTWARE DESIGN
    4.1 Class Diagrams
5. PRELIMINARY RESULTS
6. CONCLUSIONS
7. FUTURE WORK
Appendix A
    1. Installation steps
    2. User Guide
    3. Steps to add a new scenario
    4. Steps to add a new training algorithm
    5. Steps to add a new splitting algorithm
    6. Steps to add a new fitness function
Appendix B
    Source Listing
    Selected Source Code
Bibliography
LIST OF TABLES

1. Comparison of PSO runs
2. Grayscale outputs of various training algorithms

LIST OF FIGURES

1. Velocity formula for PSO
2. Trained Region Algorithm
3. Area-Based Binary Splitting
4. Dialog window for setting custom parameters
5. Replay pane in the GUI
6. Screenshot of global and individual grayscale images
7. Modular Neural Network Classes
8. Splitting and Training Classes
9. User Interface Classes
10. Generalization Rates for Splitting and Training Algorithms
11. Network Size for Splitting and Training Algorithms
12. Generalization Rate for Fitness Functions on PSO
13. Sample scenario file
14. Sample training set
15. Sample XML Results file
Chapter 1
INTRODUCTION
Modular neural networks provide a divide-and-conquer approach to solving problems that
a single neural network is unable to solve. Because of the "black box" nature of neural networks,
it can be difficult to understand the underlying nature and unintended side effects of using
multiple networks together. An improved understanding of the various characteristics and
generalization ability of these networks may lead to insights in improving the algorithms used.
With this motivation, a visualization tool was developed in 2008 as a senior project by
Michael Daniels, James Boheman, Marcus Watstein, Derek Goering, and Brandon Urban
[Gordon3]. It provided a graphical view into several aspects of the splitting and training
algorithms. In particular, the tool attempted to provide details on the order of the domain
partitioning by the splitting algorithm and the area covered by each individual network.
This project provided several additions to the tool to increase its extensibility, add
multiple splitting and training algorithms, provide more information on the individual networks,
and integrate several improvements to enhance the tool’s reporting functionality.
Chapter 2
BACKGROUND

2.1 Neural Networks
Neural networks arose through work in artificial intelligence on simulating neuron
functions in the brain [Russell]. The basic structure of a neural network is a directed graph of
neurons (vertices) connected via weights (edges). Values are entered on the input nodes, undergo
a series of additions and multiplications as they pass through the network structure, and the final
result is saved to the output nodes. Adjustments to the weight values change the data calculations
throughout the network and therefore the final values. For a standard feedforward network,
training is done by running a set of training data consisting of inputs with known outputs through
the network and attempting to determine a set of weights that causes the network results to
sufficiently approximate the known outputs. This approach is called supervised learning.
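To make the arithmetic concrete, the following is a minimal sketch of a single-layer feedforward
pass with a sigmoid activation; the method and parameter names are illustrative, not taken from
the tool's classes:

    // Minimal sketch of one feedforward layer pass (illustrative names).
    // Each output node sums its weighted inputs plus a bias, then applies
    // a sigmoid activation.
    static double[] layerForward(double[] inputs, double[][] weights, double[] biases) {
        double[] outputs = new double[biases.length];
        for (int j = 0; j < biases.length; j++) {
            double sum = biases[j];
            for (int i = 0; i < inputs.length; i++) {
                sum += weights[j][i] * inputs[i]; // weight from input i to node j
            }
            outputs[j] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid squashes to (0,1)
        }
        return outputs;
    }

Training then reduces to searching for weight and bias values that make the final layer's outputs
match the known outputs.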
While neural networks do not provide exact outputs, they can provide close
approximations. Once trained, they run quickly and often generalize well to provide correct
outputs for non-training data. This is valuable in situations where an exact algorithm cannot be
determined, or where one would run too slowly to be practical.
2.2 Neural Network Training Algorithms
The primary complexity in neural networks is in the training algorithm that adjusts the
weights. Backpropagation is a widely used method. It uses a two-step approach in which the
output error is calculated for each item in the training set and fed backwards through the network
using stochastic gradient descent [Rojas].
The process of adjusting weights to optimize outputs can be easily mapped to a variety of
optimization and search algorithms, and several alternative training approaches have been
researched. These include genetic algorithms [Lu], particle swarm optimization [Hu], and
neighbor annealing [Gordon2]. All of these algorithms require feedback (also known as a fitness
function) about the correctness of their current solution, so that they know when adjustments to
the weights are more or less optimal. One basic approach is a minimum fitness function that
sums the number of training data points that generated outputs outside acceptable error. Weights
that generate fewer erroneous outputs are more “fit”.
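A sketch of that idea follows; the signature is hypothetical (the tool's own version is
SimpleUnsolvedPoints, listed in Appendix B), but the counting logic is the one described above:

    // Count the training points whose output falls outside the acceptable
    // error band. Lower is better: 0 means every point is solved.
    // Hypothetical signature, for illustration only.
    static int countUnsolvedPoints(double[] networkOutputs, double[] targets,
                                   double acceptableError) {
        int unsolved = 0;
        for (int i = 0; i < networkOutputs.length; i++) {
            if (Math.abs(networkOutputs[i] - targets[i]) > acceptableError) {
                unsolved++; // this point's output is outside acceptable error
            }
        }
        return unsolved;
    }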
2.2.1 Genetic Algorithms
Genetic algorithms were designed to simulate a simple "survival of the fittest"
evolutionary model, where individuals with characteristics that made them more fit were more
likely to pass on portions of their solution to subsequent generations [Russell]. After calculating
fitness for each individual, a selection methodology is used to pick individuals to move on to the
next generation. This selection methodology is weighted towards individuals with better fitness;
most implementations allow the same individual to be chosen more than once. In some
variations, the most fit individual found in any generation is guaranteed a selection -- this is
known as elitism. Once the next generation is chosen, individuals are paired up and portions of
their solutions are swapped at randomly chosen crossover points. Finally, each element of the
individual solution has a small chance to undergo a mutation to a new random value. As the
genetic algorithm runs, individuals with better fitness show up more frequently, leading to more
similar individuals that search a smaller portion of the solution space.
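A minimal sketch of one generation under the scheme just described (fitness-proportionate
selection, single-point crossover, per-element mutation) is shown below. All names are
illustrative, and the tool's GeneticAlgorithm class differs in its details:

    import java.util.Random;

    // One generation of a simple genetic algorithm over real-valued weight
    // vectors. Assumes non-negative fitness values. Illustrative only.
    static double[][] nextGeneration(double[][] pop, double[] fitness,
                                     double mutationRate, Random rng) {
        int n = pop.length, genes = pop[0].length;
        double totalFitness = 0;
        for (double f : fitness) totalFitness += f;

        double[][] next = new double[n][];
        for (int i = 0; i < n; i++) {           // fitness-proportionate selection
            double pick = rng.nextDouble() * totalFitness, acc = 0;
            int chosen = 0;
            for (int j = 0; j < n; j++) {
                acc += fitness[j];
                if (acc >= pick) { chosen = j; break; }
            }
            next[i] = pop[chosen].clone();      // same individual may be chosen twice
        }
        for (int i = 0; i + 1 < n; i += 2) {    // single-point crossover on pairs
            int cut = rng.nextInt(genes);
            for (int g = cut; g < genes; g++) {
                double tmp = next[i][g];
                next[i][g] = next[i + 1][g];
                next[i + 1][g] = tmp;
            }
        }
        for (double[] ind : next)               // small chance of random mutation
            for (int g = 0; g < genes; g++)
                if (rng.nextDouble() < mutationRate)
                    ind[g] = rng.nextDouble() * 2 - 1; // new random weight in [-1,1]
        return next;
    }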
2.2.2 Particle Swarm Optimization
Particle swarm optimization (PSO) attempts to simulate swarm intelligence seen in the
flocking behavior found in animals such as birds and bees [Hu]. Particles within the swarm
gravitate towards better solutions, searching these areas more thoroughly. Individual particles
track their current position, velocity and the best position (pbest) they have discovered. A global
best position (gbest) is also tracked. At each iteration, the particle velocity is updated using the
following formula [Hu]:
p.velocity = c0 * p.velocity + c1 * R1 * (p.best - p.position) +
c2 * R2 * (g.best - p.position)
where c0, c1, c2 are constants and R1 and R2 are randomly chosen
from [0.0, 1.0].
Figure 1. Velocity formula for PSO.
The new velocity is then used to update the particle's position. Like genetic algorithms,
in later iterations of the PSO algorithm individual solutions become clustered around the current
best solution, looking for slight improvements.
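A sketch of one PSO iteration applying the Figure 1 formula is shown below. The names are
illustrative; the tool's ParticleSwarmOptimization class is listed in Appendix B:

    import java.util.Random;

    // One velocity/position update for a single particle, following the
    // Figure 1 formula. Field names are illustrative.
    static void updateParticle(double[] position, double[] velocity,
                               double[] pbest, double[] gbest,
                               double c0, double c1, double c2, Random rng) {
        double r1 = rng.nextDouble();   // R1, R2 drawn once per update here;
        double r2 = rng.nextDouble();   // some variants redraw per dimension
        for (int d = 0; d < position.length; d++) {
            velocity[d] = c0 * velocity[d]
                        + c1 * r1 * (pbest[d] - position[d])
                        + c2 * r2 * (gbest[d] - position[d]);
            position[d] += velocity[d]; // move particle along its new velocity
        }
    }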
2.2.3 Neighbor Annealing
Neighbor annealing is a variation of simulated annealing [Gordon2]. A random position
in the search space is chosen. At each iteration of the algorithm, a neighboring position is
randomly chosen. If the neighboring position contains a better solution than the current one, it
becomes the new current position. An annealing schedule is used to adjust how far away the
neighbor is allowed to be. Initially the neighborhood size covers the search space, allowing the
algorithm to jump anywhere. At each iteration the neighborhood size is decreased, eventually
reducing to a form of hill-climbing.
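A minimal sketch of the loop just described, assuming a generic scoring callback, follows.
It is illustrative; the tool's NeighborAnnealing class differs in its details:

    // Generic fitness callback assumed for this sketch.
    interface Scorer { double score(double[] weights); }

    // Neighbor annealing over a weight vector: try a random neighbor within
    // a shrinking radius; keep it if it scores better.
    static double[] neighborAnneal(double[] start, double radius, double decay,
                                   int iterations, Scorer scorer, java.util.Random rng) {
        double[] current = start.clone();
        double currentScore = scorer.score(current);
        for (int i = 0; i < iterations; i++) {
            double[] neighbor = current.clone();
            for (int d = 0; d < neighbor.length; d++) {
                neighbor[d] += (rng.nextDouble() * 2 - 1) * radius; // step within radius
            }
            double s = scorer.score(neighbor);
            if (s > currentScore) { // better neighbor becomes the current position
                current = neighbor;
                currentScore = s;
            }
            radius *= decay; // annealing schedule shrinks the neighborhood
        }
        return current;
    }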
2.3 Self-Splitting Neural Networks
The training phase for the neural network is not guaranteed to find an acceptable set of
weights to solve the training data. Sometimes the network is unable to effectively generalize due
to the complexity of the training data, insufficient training time, or limitations in the network
structure (i.e. the network lacks sufficient nodes to come up with a realistic model).
Modular neural networks address these issues by partitioning the input domain between
several neural networks. Self-splitting neural networks automate this division process. The ideal
splitting algorithm should provide the best generalization possible while limiting the number of
networks created [Gordon4]. Several splitting approaches have been proposed. The following
are implemented in the visualization tool:
1. Centroid splitting finds the domain dimension with the most distinct values and splits it into
roughly equivalent pieces. This approach attempts to equally halve the training set, without
consideration for any partially solved regions of the set. This can be problematic for
networks that are close to learning the entire training set, as splitting the set in half may break
up the points that led to the solution.
2. Trained region attempts to split the set based off the network performance, by ensuring the
largest subset of contiguously solved points in a single dimension, called a chunk, is not split
apart. The split occurs based on where the chunk falls in the training set, using the algorithm
in Figure 2 [Gordon 1]. The smaller unsolved regions should be easier for the new networks
to solve, as they contain fewer points and therefore less complexity.
for each dimension d
    sort training set on d
    scan each point in the sorted set to determine range of
        largest contiguously solved points (chunk)
if chunk is too small
    do centroid split
else if chunk falls on an edge of the training set
        (i.e. [chunk] [unsolved region])
    create solved network for chunk
    create unsolved network for unsolved region with randomized weights
else if chunk falls in the middle of the training set
        (i.e. [unsolved region 1] [chunk] [unsolved region 2])
    create solved network for chunk
    create unsolved network for unsolved region 1 with randomized weights
    create unsolved network for unsolved region 2 with randomized weights

Figure 2. Trained Region Algorithm
Each new unsolved network in trained region splitting starts with randomized weights. For
cases where the network was close to solving the training set, the new network may waste a
lot of cycles just getting close to the previous network.
3. Area-based binary splitting tries to solve this problem by adjusting the trained region
algorithm. For cases where the chunk falls in the middle of the training set, the algorithm in
Figure 3 is used [Gordon4].
else if chunk falls in the middle of the training set
        (i.e. [unsolved region 1] [chunk] [unsolved region 2])
    create unsolved network for ([smaller unsolved region] + [chunk]),
        starting with weights from parent network
    create unsolved network for [larger unsolved region] with
        randomized weights

Figure 3. Area-Based Binary Splitting
For a network that has nearly solved the training set, the combination of fewer unsolved
points and more cycles to fine-tune the weights should improve the network's ability to solve
the set. In the worst case, area-based binary splitting may take extra cycles and do the same
split as trained region splitting.
2.4 Technologies
The visualization tool was initially developed in Java using the Swing toolkit. This
approach was kept for this project, as significant development effort would be required to
transition to a new language or toolkit without an obvious benefit. Java allowed the tool to be
developed for deployment without regard for target platforms, as portability is handled by the
Java virtual machine.
Swing is a widget toolkit included as part of the Java Foundation Classes to provide a
graphical user interface API. It provides a large set of cross-platform GUI components that
provide a consistent look and feel. Swing components also proved to be highly customizable,
allowing for fast and straightforward development of the GUI [Fowler].
The program was grouped into several Java packages that cover the various splitting and
training algorithms, the graphical user interface (GUI), and the underlying modular neural
network structure. These packages went through a large amount of refactoring over the course of
the project; the final structures can be seen in more detail in Section 4.1 (Class Diagrams).
Communication between the GUI, splitting, and training packages was handled by interfaces and
event handlers, but did not account for the extra parameters required by different algorithms.
Chapter 3
VISUALIZATION ENHANCEMENTS

3.1 Training Options
To improve the usefulness of the visualization tool, a large section of the underlying
framework was rewritten to allow for adding new algorithms. These include splitting algorithms,
neural network training algorithms, and fitness functions. An interface was expanded upon or
developed for each, along with corresponding hooks into the GUI to enable the end user to
choose the appropriate algorithm and load or save pre-built scenario files. For more detail on the
changes, see Section 4 (Software Design). Refer to Appendix A for steps to add new algorithms
to the program.
Figure 4. Dialog window for setting custom parameters.
3.2 Rewind / Replay Functionality
As the training set is partitioned into smaller sets the amount of time required to solve
each subset tends to decrease. This makes observing the order of splitting difficult towards the
end of the training run. A slider was added to the GUI to allow the user to rewind or fast-forward
through individual networks of the last trained self-splitting neural network (SSNN) and observe
various characteristics of the currently selected network.
Figure 5. Replay pane in the GUI.

3.3 Domain Scaling
In the initial implementation of the visualization tool, inputs to the individual networks
were not scaled separately. This can make networks with closely spaced training subsets
difficult to solve, as the network must make very small changes in weights. The tool was
modified to scale each training subset input to [0,1] to solve this problem. For the special
case where one dimension contains only a single value (causing infinite scaling and
divide-by-zero issues), the scaling factors from the split network are assigned to the new networks.
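A sketch of the per-dimension rescaling follows (a hypothetical helper, not the tool's
ScalingParameters class):

    // Rescale one input dimension to [0,1]. Returns null when the dimension
    // has no spread (max == min), signaling the caller to reuse the parent
    // network's scaling factors as described above. Hypothetical helper.
    static double[] scaleToUnitInterval(double[] values) {
        double min = values[0], max = values[0];
        for (double v : values) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        if (max == min) {
            return null; // single-valued dimension: scaling would divide by zero
        }
        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = (values[i] - min) / (max - min);
        }
        return scaled;
    }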
3.4 Normalization / Grayscale
While each individual network has a specified subset of the input domain, networks that
generalized well may actually be able to solve a larger portion of the training set. Insights in this
area may lead to creating better splitting algorithms that can take advantage of how much each
individual network is really capable of solving. A previous update to the visualization tool had
provided grayscale results for the SSNN [Gordon4]. Each pixel in the viewing windows was fed
into the SSNN and the output scaled to a grayscale integer value. The resulting image provided
an easy-to-understand visual mapping of the network's outputs across the domain. This grayscale
enhancement was merged with the new normalization code, allowing the user to also see how
each individual network in the SSNN would solve the entire domain.
Figure 6. Screenshot of global and individual grayscale images.
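A sketch of that pixel-by-pixel rendering follows (all names are illustrative; the tool's actual
drawing code lives in SSNNVisPanel). It assumes a two-input, one-output network whose output
falls roughly in [0,1] and a viewing window larger than one pixel in each direction:

    import java.awt.image.BufferedImage;

    // Input-to-output mapping assumed for this sketch.
    interface Net { double evaluate(double x, double y); }

    // Render a network's output over the viewing window as a grayscale image.
    static BufferedImage grayscaleImage(Net net, int width, int height,
                                        double xMin, double xMax,
                                        double yMin, double yMax) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int px = 0; px < width; px++) {
            for (int py = 0; py < height; py++) {
                // map pixel coordinates into the input domain
                double x = xMin + (xMax - xMin) * px / (width - 1);
                double y = yMin + (yMax - yMin) * py / (height - 1);
                // clamp the output to [0,1], then scale to a gray level 0-255
                double out = Math.max(0.0, Math.min(1.0, net.evaluate(x, y)));
                int gray = (int) Math.round(out * 255);
                img.setRGB(px, py, (gray << 16) | (gray << 8) | gray);
            }
        }
        return img;
    }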
3.5 Logging Functionality
While the primary usage of the visualization tool involves direct manipulation by the
user, it is also useful to gather details on each run for later analysis. A logging system was built
to dump information on run settings and testing results to a comma-separated values (CSV) file.
The log provides enough data to recreate the same run later if the user finds something they want
to revisit. It also allows for easy aggregation and plotting of the data.
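For illustration, a single run might be captured as one CSV row like the following. The column
set here is hypothetical, not the tool's exact log format; the result values are those of Run 1
in Chapter 5, with a placeholder seed:

    seed,trainer,splitter,fitness,maxIterations,networks,generalization,duration
    0,Particle Swarm Optimization,Area Based Binary Split,Largest Chunk,300000,59,0.98769,19.0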
Chapter 4
SOFTWARE DESIGN
Several interfaces and abstract classes were developed to improve the code modularity.
This streamlines the process of adding new training, splitting, and fitness algorithms and reduces
the amount of coding required for future enhancements.
Training algorithms are all located in the edu.csus.ecs.ssnn.nn.trainer package. At a
minimum, all training algorithms are required to implement the TrainingAlgorithmInterface.
This defines the functions required for training and reporting results. All the current training
algorithms also extend the TrainingAlgorithm base class, which provides several common helper
functions. For providing GUI options around each training algorithm,
edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java provides the front-end dialog box, while
edu.csus.ecs.ssnn.data.TrainingParameters.java defines the backend code used to create new
training algorithm objects. A generic Params object is used in the TrainingParameters.java file to
reduce the changes required to add new parameters - only the dialog and training algorithm files
need to be adjusted. To make the same options work with saved and loaded scenario files,
edu.csus.ecs.ssnn.fileio.SSNNDataFile.java contains code to push/pull the options to/from XML.
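The registration pattern can be seen in miniature below: a display-name map drives the GUI
drop-down, and a factory switch constructs the chosen object. This is a self-contained sketch of
the idea only; the real versions are in TrainingParameters.java (Appendix B):

    import java.util.HashMap;
    import java.util.Map;

    // Miniature version of the registration pattern used by TrainingParameters:
    // the map ties a display string to an enum value, and the factory switch
    // builds the matching object.
    class TrainerFactory {
        enum AlgorithmType { BACK_PROPAGATION, NEIGHBOR_ANNEALING }

        static final Map<String, AlgorithmType> NAME_MAP =
            new HashMap<String, AlgorithmType>() {{
                put("Back Propagation", AlgorithmType.BACK_PROPAGATION);
                put("Neighbor Annealing", AlgorithmType.NEIGHBOR_ANNEALING);
            }};

        interface Trainer { void train(); }

        static Trainer create(AlgorithmType type) {
            switch (type) {
                case BACK_PROPAGATION:
                    return new Trainer() { public void train() { /* ... */ } };
                case NEIGHBOR_ANNEALING:
                    return new Trainer() { public void train() { /* ... */ } };
                default:
                    return null;
            }
        }
    }

Adding an algorithm then means adding an enum constant, a map entry, and a switch case.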
4.1 Class Diagrams
Three subsections of the class diagram are provided in the following figures. These
sections constitute the core functionality of the program – covering the main user interface, the
modular neural network, and the splitting and training classes.
Figure 7. Modular Neural Network Classes.

Figure 8. Splitting and Training Classes.

Figure 9. User Interface Classes.
Chapter 5
PRELIMINARY RESULTS
After the additional algorithms were implemented in the tool, several sample runs were
done to demonstrate how the tool can be used to gather data on the various algorithms. Initial
tests were run on the two-spiral problem as it is a difficult problem for neural networks to solve
[Gordon3] and provides simple, easy-to-generate data sets. Runs were done with Particle Swarm
Optimization, area-based splitting, and largest-chunk fitness, with a hard-coded seed and all other
settings left at their default values. Run details and visualization images can be seen in Table 1.
Run 1 solved quickly, but all networks were simple and most encompassed a single
portion of one spiral. Individual networks showed simplistic grayscale images. Adding a second
4-node hidden layer for run 2 led individual networks to begin creating more complex patterns.
A third 4-node hidden layer was added for run 3. It did not improve the final SSNN.
Total networks increased while the generalization rate went down. There were several cases of
the network generating unnecessary complexity for the region it was covering. In this case, it
appears the network's extra complexity worked against it, as the final individually solved
networks did not cover appreciably larger areas and their extra complexity reduced the ability for
the training algorithm to generalize correctly, as seen in the grayscale images of the individual
networks.
For run 4, iterations were increased from 300,000 to 1,000,000. This had a positive effect
on the three-layer network, as the number of networks produced went down and several networks
covered larger areas. However, grayscale images did not show a noticeable increase in complexity.
Run 5 attempted the same number of iterations on the two-layer network, which
decreased its performance. This could be due to the simpler neural network being unable
to generate a sufficiently complex model to cover larger regions.
area based splitting, which causes some new networks to start off with current weight values, they
may have started with an overly complex model for the new, smaller region, leading to a lower
generalization rate. Further research is required to confirm this.
For run 6, iterations were increased to 2,000,000. This did not reduce the networks
produced, but the extra iterations clearly took advantage of the network's extra complexity to
model more accurate curves along larger regions. Run 7 increased iterations to 3,000,000. The
network count was reduced, but generalization went down.
The networks were increased to four layers each with four nodes for run 8. This did not
improve generalization, but the networks appear to start covering larger, more complex regions.
Since total networks do not significantly decrease, there may be a shift towards both larger and
smaller region networks, reducing the uniformity in network size found in the fewer layer
networks. Further analysis of the training log could yield more information on the variation of
region sizes between the different networks.
Run 1
    Network: One 4-node hidden layer
    Max Iterations: 300,000
    Generalization: 0.98769
    Network Count: 59
    Results: Duration: 19.0 sec. Total iterations: 17561305.
        Chunk splits: 56. Centroid splits: 2. Size failures: 2.

Run 2
    Network: Two 4-node hidden layers
    Max Iterations: 300,000
    Generalization: 0.99148
    Network Count: 57
    Results: Duration: 31.0 sec. Total iterations: 17509249.
        Chunk splits: 55. Centroid splits: 1. Size failures: 1.

Run 3
    Network: Three 4-node hidden layers
    Max Iterations: 300,000
    Generalization: 0.98580
    Network Count: 62
    Results: Duration: 51.0 sec. Total iterations: 19682785.
        Chunk splits: 61. Centroid splits: 0. Size failures: 0.

Run 4
    Network: Three 4-node hidden layers
    Max Iterations: 1,000,000
    Generalization: 0.98106
    Network Count: 51
    Results: Duration: 133.0 sec. Total iterations: 52336342.
        Chunk splits: 50. Centroid splits: 0. Size failures: 0.

Run 5
    Network: Two 4-node hidden layers
    Max Iterations: 1,000,000
    Generalization: 0.98580
    Network Count: 62
    Results: Duration: 114.0 sec. Total iterations: 61591067.
        Chunk splits: 59. Centroid splits: 2. Size failures: 2.
    No images collected.

Run 6
    Network: Three 4-node hidden layers
    Max Iterations: 2,000,000
    Generalization: 0.99053
    Network Count: 59
    Results: Duration: 335.0 sec. Total iterations: 119159666.
        Chunk splits: 58. Centroid splits: 0. Size failures: 0.

Run 7
    Network: Three 4-node hidden layers
    Max Iterations: 3,000,000
    Generalization: 0.98769
    Network Count: 52
    Results: Duration: 399.0 sec. Total iterations: 159121244.
        Chunk splits: 51. Centroid splits: 0. Size failures: 0.
    No images collected.

Run 8
    Network: Four 4-node hidden layers
    Max Iterations: 3,000,000
    Generalization: 0.98300
    Network Count: 54
    Results: Duration: 523.0 sec. Total iterations: 160727717.
        Chunk splits: 50. Centroid splits: 3. Size failures: 3.

Table 1. Comparison of PSO runs.
The logging system was used to track several runs of each algorithm for comparison
purposes. The sample size was small – ten runs on each algorithm – but the intent was to
demonstrate how run statistics can be quickly gathered and synthesized into useful information.
The run log was used to collect run data for each combination of the available splitting and
training algorithms (using default splitting and training settings).
Figure 10. Generalization Rates for Splitting and Training Algorithms.
Figure 11. Network Size for Splitting and Training Algorithms.
A second set of runs was done on PSO comparing the fitness function performance.
Again, all default settings were used and ten runs were done on each setting.
Figure 12. Generalization Rate for Fitness Functions on PSO.
Chapter 6
CONCLUSIONS
The project required a large amount of code refactoring to improve the program's
extensibility and add all the features mentioned above. Much of the first half of the project was
spent defining a logical class hierarchy and moving logic to the appropriate location. Over the
course of this project, several changes were made as features were tested and found to have
varying usefulness.
Domain scaling in particular went through several iterations. Testing with the initial
implementation found networks solved poorly or had errors in situations where a particular
network had a domain dimension consisting of one point (i.e. {(1.5, -10.0), (1.5, 10.0)}).
Attempting to scale to [1.5, 1.5] caused a loss of information in that dimension, and any attempt
to scale back out caused a divide-by-zero error. The solution was to assign the same scaling used
by the parent network when this situation arose in the inputs.
Initially, outputs were also scaled. This proved to be less useful for some training sets,
such as the two-spiral problem, because training outputs were either 0 or 1. Training subsets'
output domains were {0}, {1}, or {0,1}, and scaling a subset of all zeroes (or ones) to [0,1]
caused the error calculation to become meaningless.
identical output leads to a useless network, as it usually generalizes to either all inputs passing or
all inputs failing. Future splitting algorithms may want to ensure that splits do not leave any
training subsets with only one distinct output.
Initial testing of the domain scaling code showed a marked improvement in training.
Without scaling, training would occasionally fail when network splits would lead to a grouping of
three or four closely spaced points with differing outputs. While networks can still fail, with
domain scaling the frequency is much lower.
The normalization code that tested how well an individual network solved all points in
the training set also went through a few iterations. Initially it simply displayed which points from
the training set could be solved. This turned out to be of little use, as no real insight could
be gained by looking at a scattering of points across the domain. However, applying this code to
the grayscale image generation provides a much better sense of what the network solution looks
like, and should prove to be a useful feature.
The replay functionality was very useful over the course of this project. The ability to
look closely at an individual network made debugging the new splitting and training algorithms
much easier – several subtle errors were caught by reviewing the output of individual networks.
While this was not the primary purpose of the feature, it demonstrated how a different view on
the data can provide new insights.
Chapter 7
FUTURE WORK
This project focused on expanding the capabilities and improving the extensibility of the
visualization tool. These enhancements allow for a variety of new research; possibilities include:
- A splitting algorithm that ensures all split regions contain at least two distinct output values.
- Performance of a sum-squared error fitness function.
- Analyzing and comparing new neural network training algorithms.
- Analyzing and comparing new modular neural network splitting algorithms.
- Analyzing and comparing new fitness functions.
The visualization tool provides a framework for future research into splitting, training,
and fitness algorithms and their effect on self-splitting neural networks. Over the course of this
project, several new algorithms were added and tested. Several example images generated by the
different training algorithms are shown below in Table 2. However, at this point more data is
needed to characterize any differences.
[Table 2 cells: grayscale output images for Runs 1 and 2 under Particle Swarm Optimization,
Genetic Algorithm, and Neighbor Annealing.]

Table 2. Grayscale outputs of various training algorithms.
There are also a few remaining enhancements that could prove useful. The tool can only
display two input dimensions and one output dimension; on higher-dimension problems the user
should be able to choose which dimensions to draw. There may also be better ways to
display higher-dimension problems. Also, fitness functions are not fully integrated with the
logging and pre-defined scenario files – improving this process would allow more detailed
logging and reduce the steps necessary to set up a new training run.
APPENDIX A
1. Installation steps
Run all commands from SSNN\bin directory.
Compile:
javac -d "." -classpath "..\lib\jdom.jar;..\lib\swing-layout-1.0.3.jar"
-sourcepath "..\src" "..\src\edu\csus\ecs\ssnn\ui\MainGUI.java"
Create JAR:
jar cvf vis.jar edu
Run:
java -classpath "..\lib\jdom.jar;..\lib\swing-layout-1.0.3.jar;vis.jar"
edu.csus.ecs.ssnn.ui.MainGUI
2. User Guide
The program should be run with the included SSNN\bin directory as the working
directory. This directory should contain the following files and folders:
manual.xml - used to populate the embedded manual in the program.
scenarios\ - holds scenario files, detailed in "Steps to add a new scenario".
A manual is provided with the program. It can be accessed from the menu via Help ->
Manual. The manual provides information on the individual UI elements, as well as explanations
of the various algorithms used. It is pulled from the manual.xml file located in the /bin folder.
3. Steps to add a new scenario
Scenarios are used to load predefined algorithm settings and data sets, reducing the
number of steps needed to kick off training runs. The following example shows how to add a
new scenario called "My scenario".
1. Create a new folder in SSNN\bin\scenarios\ called "My scenario".
2. Create a text file called scenario.txt with the following content:

# Name to display
Name= My scenario
# Description to display
Description= Sample scenario
# Training data set
TrainingData= myTrainingSet.dat
# Testing data set
TestingData= myTestingSet.dat
# XML file that contains information on preset algorithms and parameters.
ResultsFile= myscenario.xml
# max/min vals for dimensions of training/testing sets.
ScaleIn0= -10,10
ScaleIn1= -10,10
ScaleOut0= 0,1
# Description of network, each layer is described:
# hiddenTopology= num_nodes_in_layer
# e.g. this is a 1 layer 4 node Neural Network:
hiddenTopology= 4

Figure 13. Sample scenario file.
3. Provide the training and testing data files (in this case named myTrainingSet.dat and
myTestingSet.dat). These files provide a simple list of training/testing points, provided
as a space delimited list of inputs followed by outputs:
9.97616 0.35630 1
-9.97616 -0.35630 0
9.93965 0.71090 1
-9.93965 -0.71090 0
9.89056 1.06335 1
-9.89056 -1.06335 0
9.82900 1.41320 1
Figure 14. Sample training set.
4. Provide the XML results file. It provides details on the various preset options chosen for
this scenario. Example file content is below:
<?xml version="1.0" encoding="UTF-8"?>
<SSNNDataFile>
<TrainingResults Solved="false" TrainingDuration="0.0"
ChunksProduced="-1" ChunkSplits="0" HalfSplits="0" SizeFailures="0"
TrainingIterations="0">
<TrainingParameters SplitMode="CHUNK SPLIT"
SplitAlgorithm="Trained Region" TrainingAlgorithm="Back Propagation"
MaxRuns="300000" MinChunkSize="3" RandomSeed="0"
SuccessCriteria="0.4" LearningRate="0.3" Momentum="0.9"
fitness="Largest Chunk" />
</TrainingResults>
<TestingResults GeneralizationRate="-1.0">
<TestingParameters CorrectnessCriteria="0.4" />
</TestingResults>
<VisualizationOptions ShowBackgroundImage="false" ShowAxes="true"
ShowColors="true" ShowPoints="true" />
</SSNNDataFile>
Figure 15. Sample XML Results file.
4. Steps to add a new training algorithm
The best approach is to use an existing training algorithm as a template while using the
following steps to ensure a particular function is not skipped.
1. Add the new training algorithm in a .java file in the edu.csus.ecs.ssnn.nn.trainer package. It
must implement TrainingAlgorithmInterface. It is recommended to extend the
TrainingAlgorithm base class. The instructions assume this has been done.
2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java:
a. Add a new entry to the AlgorithmType_Map hash map. This allows the
algorithm to show up in the GUI drop-down for training algorithms. It is also
used to identify the algorithm in any saved/loaded scenario files and log files.
b. Add a new public variable for MyAlgorithm.Params.
c. Update TrainingParameters() to initialize the new params object.
d. Update the switch statement in getTrainingAlgorithmObject() to return a new
instance of the training algorithm.
3. Update edu.csus.ecs.ssnn.fileio.SSNNDataFile.java:
a. In the save() function update the switch statement on training algorithms to set
XML strings corresponding to each Params element.
b. In the loadFrom() function update the switch statement on training algorithms to
set the Params object from the XML strings defined in save().
4. Update edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java
a. Add a new card (JPanel) to algorithmPanel.jPanel_AlgorithmOptions. This card
will contain any settable parameters specific to the algorithm.
b. Make sure to set the Card Name (under Properties) to the same string used in
TrainingParameters.AlgorithmType_Map. This allows the auto-populated
combobox to correctly display the card.
c. Add a new help string for the algorithm.
d. Wire up the MouseEnter properties for the card to display the help string.
e. Update getTrainingParameters() to set the Params object from the custom UI
elements defined on the card.
f. Update presetControls() to set the custom UI elements defined on the card from
the Params object.
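For step 2a, the new entry sits alongside the existing ones in
TrainingParameters.AlgorithmType_Map (see Appendix B); the algorithm name and enum constant
below are hypothetical:

    public static final Map<String, AlgorithmType> AlgorithmType_Map =
        new HashMap<String, AlgorithmType>() {{
            put("Back Propagation", AlgorithmType.BACK_PROPAGATION);
            put("Genetic Algorithm", AlgorithmType.GENETIC_ALGORITHM);
            put("Particle Swarm Optimization", AlgorithmType.PARTICLE_SWARM_OPTIMIZATION);
            put("Neighbor Annealing", AlgorithmType.NEIGHBOR_ANNEALING);
            put("My Algorithm", AlgorithmType.MY_ALGORITHM); // hypothetical new entry
        }};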
5. Steps to add a new splitting algorithm
Adding a new splitting algorithm follows a similar approach to training algorithms, with
a few steps removed. This is because the splitting algorithms do not take any extra parameters
that must be accounted for. Note - this may change for future splitting algorithms, which would
require refactoring some areas of code, most likely to add a similar parameter passing system to
that used for the training algorithms.
1. Add the new splitting algorithm in a .java file in the edu.csus.ecs.ssnn.nn.splittrainer
package. It must implement SSNNTrainingAlgorithmInterface. It is recommended to
extend the SSNNTrainingAlgorithm base class. The instructions assume this has been
done.
2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java:
a. Add a new entry to the SplitAlgorithmType_Map hash map. This allows the split
algorithm to show up in the GUI drop-down for splitting algorithms. It is also
used to identify the algorithm in any saved/loaded scenario files and log files.
b. Update the switch statement in getSplitTrainingAlgorithmObject to return a new
instance of the splitting algorithm (with appropriate options set).
6. Steps to add a new fitness function
Fitness functions follow a similar flow to the other algorithms. Because they are not used
by all training functions (such as backpropagation), they are considered as one of the custom
parameters that can be set by individual training algorithms. Currently, fitness functions are not
being saved or loaded by scenario files or reported in log files. This is primarily a workaround to
the inability to pass function references in Java, and could be addressed with future
enhancements. Adding a new fitness function requires the following steps.
1. Add the new fitness algorithm in a .java file in the edu.csus.ecs.ssnn.nn.trainer.fitness
package. It must implement FitnessInterface.
2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java by adding a new entry to
FitnessFunction_Map hash map. Note that this approach differs from the training
algorithms in that an instance of the fitness function is created.
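As a sketch, a sum-squared error fitness function (mentioned in Chapter 7) might look like the
following. FitnessInterface's actual method signature appears only in the full source, so the
single scoring method shown here is an assumption:

    package edu.csus.ecs.ssnn.nn.trainer.fitness;

    // Hypothetical sum-squared error fitness function. The score method's
    // signature is assumed for illustration; FitnessInterface's real
    // signature is in the full source listing.
    public class SumSquaredError implements FitnessInterface {
        public double score(double[] outputs, double[] targets) {
            double sum = 0.0;
            for (int i = 0; i < outputs.length; i++) {
                double err = outputs[i] - targets[i];
                sum += err * err; // penalize large errors quadratically
            }
            return -sum; // negated so that higher fitness remains better
        }
    }

It would then be registered in FitnessFunction_Map with
put("Sum Squared Error", new SumSquaredError());.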
APPENDIX B
Full source code is too long to provide in this paper. Selected source files are available
below. The complete source can be downloaded from the Mercurial repository at
http://www.digitalxen.net/school/project/.
Source Listing
Underlined files have their source code included below.
edu.csus.ecs.ssnn.data.ScalingParameters.java
edu.csus.ecs.ssnn.data.ScenarioProperties.java
edu.csus.ecs.ssnn.data.TestingParameters.java
edu.csus.ecs.ssnn.data.TestingResults.java
edu.csus.ecs.ssnn.data.TrainingParameters.java
edu.csus.ecs.ssnn.data.TrainingResults.java
edu.csus.ecs.ssnn.data.VisualizationOptions.java
edu.csus.ecs.ssnn.data.XMLTreeModel.java
edu.csus.ecs.ssnn.event.NetworkConvergedEvent.java
edu.csus.ecs.ssnn.event.NetworkSplitEvent.java
edu.csus.ecs.ssnn.event.NetworkTrainerEventListenerInterface.java
edu.csus.ecs.ssnn.event.NetworkTrainingEvent.java
edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface.java
edu.csus.ecs.ssnn.event.TestingCompletedEvent.java
edu.csus.ecs.ssnn.event.TestingProgressEventListenerInterface.java
edu.csus.ecs.ssnn.event.TrainingCompletedEvent.java
edu.csus.ecs.ssnn.event.TrainingProgressEvent.java
edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface.java
edu.csus.ecs.ssnn.event.fileio.SSNNDataFile.java
edu.csus.ecs.ssnn.event.fileio.SSNNLogging.java
edu.csus.ecs.ssnn.nn.DataFormatException.java
edu.csus.ecs.ssnn.nn.DataPair.java
edu.csus.ecs.ssnn.nn.DataRegion.java
edu.csus.ecs.ssnn.nn.DataSet.java
edu.csus.ecs.ssnn.nn.IncompatibleDataException.java
edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.java
edu.csus.ecs.ssnn.nn.NNRandom.java
edu.csus.ecs.ssnn.nn.NetworkLayer.java
edu.csus.ecs.ssnn.nn.NetworkNode.java
edu.csus.ecs.ssnn.nn.NeuralNetwork.java
edu.csus.ecs.ssnn.nn.NeuralNetworkException.java
edu.csus.ecs.ssnn.nn.trainer.BackPropagation.java
edu.csus.ecs.ssnn.nn.trainer.GeneticAlgorithm.java
edu.csus.ecs.ssnn.nn.trainer.NeighborAnnealing.java
edu.csus.ecs.ssnn.nn.trainer.ParticleSwarmOptimization.java
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithm.java
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface.java
edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface.java
edu.csus.ecs.ssnn.nn.trainer.fitness.LargestChunk.java
edu.csus.ecs.ssnn.nn.trainer.fitness.SimpleUnsolvedPoints.java
edu.csus.ecs.ssnn.splittrainer.AreaBasedBinarySplitTrainer.java
edu.csus.ecs.ssnn.splittrainer.NoSplitTrainer.java
edu.csus.ecs.ssnn.splittrainer.RandomSplitterTrainer.java
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithm.java
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithmInterface.java
edu.csus.ecs.ssnn.splittrainer.TrainedResultsSplitTrainer.java
edu.csus.ecs.ssnn.ui.CreateSSNNDialog.java
edu.csus.ecs.ssnn.ui.HelpDialog.java
edu.csus.ecs.ssnn.ui.LoadDataSetDialog.java
edu.csus.ecs.ssnn.ui.LoadNetworkFromFileDialog.java
edu.csus.ecs.ssnn.ui.LoadPredefinedScenarioDialog.java
edu.csus.ecs.ssnn.ui.LoadSavedOptionsDialog.java
edu.csus.ecs.ssnn.ui.LoadTestingSetDialog.java
edu.csus.ecs.ssnn.ui.LoadTrainingSetDialog.java
edu.csus.ecs.ssnn.ui.Log.java
edu.csus.ecs.ssnn.ui.LogSettingsDialog.java
edu.csus.ecs.ssnn.ui.MainGUI.java
edu.csus.ecs.ssnn.ui.SSNNVisPanel.java
edu.csus.ecs.ssnn.ui.SaveToFileDialog.java
edu.csus.ecs.ssnn.ui.ScalingParameters.java
edu.csus.ecs.ssnn.ui.TestingParametersDialog.java
edu.csus.ecs.ssnn.ui.TestingResultsDialog.java
edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java
edu.csus.ecs.ssnn.ui.TrainingResultsDialog.java
edu.csus.ecs.ssnn.ui.dialoghelpers.CSVFileFilter.java
edu.csus.ecs.ssnn.ui.dialoghelpers.FilePreview.java
edu.csus.ecs.ssnn.ui.dialoghelpers.XMLFileFilter.java
Selected Source Code
/* ===================================================================
edu.csus.ecs.ssnn.data.TrainingParameters.java
=================================================================== */
package edu.csus.ecs.ssnn.data;
import java.util.*;
import edu.csus.ecs.ssnn.nn.DataFormatException;
import edu.csus.ecs.ssnn.nn.trainer.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.*;
import edu.csus.ecs.ssnn.splittrainer.*;
public class TrainingParameters {
public static enum SplitType {
CENTROID_SPLIT,
CHUNK_SPLIT
};
public static enum SplitAlgorithmType {
NO_SPLIT,
TRAINED_REGION,
AREA_BASED_BINARY_SPLIT
}
// shortcut for connecting string to AlgorithmType enum.
// new algorithms need to be added to AlgorithmType and this map
public static final Map<String, SplitAlgorithmType> SplitAlgorithmType_Map =
new HashMap<String, SplitAlgorithmType>() {{
put("Area Based Binary Split",
SplitAlgorithmType.AREA_BASED_BINARY_SPLIT);
put("Trained Region", SplitAlgorithmType.TRAINED_REGION);
put("No Split", SplitAlgorithmType.NO_SPLIT);
}};
public static enum AlgorithmType {
BACK_PROPAGATION,
GENETIC_ALGORITHM,
PARTICLE_SWARM_OPTIMIZATION,
NEIGHBOR_ANNEALING
}
// shortcut for connecting string to AlgorithmType enum.
// new algorithms need to be added to AlgorithmType and this map
public static final Map<String, AlgorithmType> AlgorithmType_Map =
new HashMap<String, AlgorithmType>() {{
put("Back Propagation", AlgorithmType.BACK_PROPAGATION);
put("Genetic Algorithm", AlgorithmType.GENETIC_ALGORITHM);
put("Particle Swarm Optimization",
AlgorithmType.PARTICLE_SWARM_OPTIMIZATION);
put("Neighbor Annealing", AlgorithmType.NEIGHBOR_ANNEALING);
}};
public BackPropagation.Params BackPropagation_Params;
public GeneticAlgorithm.Params GeneticAlgorithm_Params;
public NeighborAnnealing.Params NeighborAnnealing_Params;
public ParticleSwarmOptimization.Params ParticleSwarmOptimization_Params;
// shortcut for connecting string to FitnessFunction enum.
// new algorithms need to be added to FitnessFunction and this map
public static final Map<String, FitnessInterface> FitnessFunction_Map =
new HashMap<String, FitnessInterface>() {{
put("Unsolved Points", new SimpleUnsolvedPoints());
put("Largest Chunk", new LargestChunk());
}};
// splitting algorithm for generating subnetworks within the modular NN
private SplitAlgorithmType splitAlgorithmType;
// Type of splitting to use.
private SplitType splitMode;
// Algorithm to use to find weight values for neural network
private AlgorithmType algorithmType;
// The maximum number of runs allowed before training fails as unsolved.
private int maxRuns;
// The minimum neural network size allowed from any split.
private int minChunkSize;
// Seed used in the random number generator.
private int randomSeed;
// How close to the goal a value has to be to be considered correct.
private double successCriteria;
/**
* Creates a new instance of TrainingParameters setting all
* parameters to default values.
*/
public TrainingParameters() {
this(SplitType.CHUNK_SPLIT, SplitAlgorithmType.TRAINED_REGION,
AlgorithmType.BACK_PROPAGATION, 0, 0, 0, 0.0);
}
public TrainingParameters(
SplitType splitMode,
SplitAlgorithmType splitAlgorithm,
AlgorithmType algorithm,
int maxRuns,
int minChunkSize,
int randomSeed,
double successCriteria) {
this.BackPropagation_Params = new BackPropagation.Params();
this.GeneticAlgorithm_Params = new GeneticAlgorithm.Params();
this.NeighborAnnealing_Params = new NeighborAnnealing.Params();
this.ParticleSwarmOptimization_Params = new ParticleSwarmOptimization.Params();
this.splitAlgorithmType = splitAlgorithm;
this.splitMode = splitMode;
this.algorithmType = algorithm;
this.maxRuns = maxRuns;
this.minChunkSize = minChunkSize;
this.randomSeed = randomSeed;
this.successCriteria = successCriteria;
}
public SplitAlgorithmType getSplitAlgorithmType() { return splitAlgorithmType; }
public AlgorithmType getAlgorithmType() { return algorithmType; }
public int getMaxRuns() { return maxRuns; }
public int getMinChunkSize() { return minChunkSize; }
public int getRandomSeed() { return randomSeed; }
public SplitType getSplitMode() { return splitMode; }
public double getSuccessCriteria() { return successCriteria; }
public TrainingAlgorithmInterface getTrainingAlgorithmObject(int numOutputs)
throws DataFormatException {
double[] scaledCriteria = new double[numOutputs];
scaledCriteria[0] = successCriteria;
switch (getAlgorithmType()) {
case BACK_PROPAGATION:
return new BackPropagation(BackPropagation_Params, maxRuns,
scaledCriteria);
case GENETIC_ALGORITHM:
return new GeneticAlgorithm(GeneticAlgorithm_Params, maxRuns,
scaledCriteria);
case PARTICLE_SWARM_OPTIMIZATION:
return new ParticleSwarmOptimization(ParticleSwarmOptimization_Params,
maxRuns, scaledCriteria);
case NEIGHBOR_ANNEALING:
return new NeighborAnnealing(NeighborAnnealing_Params, maxRuns,
scaledCriteria);
default:
return null;
}
}
public SSNNTrainingAlgorithmInterface getSplitTrainingAlgorithmObject(int numOutputs)
throws DataFormatException {
double[] scaledCriteria = new double[numOutputs];
scaledCriteria[0] = successCriteria;
// if centroid split, set min chunk size to force only centroids
int adjustedMinChunkSize;
if (getSplitMode() == TrainingParameters.SplitType.CENTROID_SPLIT) {
adjustedMinChunkSize = Integer.MAX_VALUE;
} else {
adjustedMinChunkSize = minChunkSize;
}
switch(getSplitAlgorithmType()) {
case AREA_BASED_BINARY_SPLIT:
AreaBasedBinarySplitTrainer area_trainer = new
AreaBasedBinarySplitTrainer();
area_trainer.setMinChunkSize(adjustedMinChunkSize);
area_trainer.setAcceptableErrors(scaledCriteria);
return area_trainer;
case TRAINED_REGION:
TrainedResultsSplitTrainer region_trainer = new
TrainedResultsSplitTrainer();
region_trainer.setMinChunkSize(adjustedMinChunkSize);
region_trainer.setAcceptableErrors(scaledCriteria);
return region_trainer;
case NO_SPLIT:
default:
NoSplitTrainer none_trainer = new NoSplitTrainer();
return none_trainer;
}
}
public void setSplitAlgorithmType(SplitAlgorithmType algorithm) {
this.splitAlgorithmType = algorithm;
}
public void setAlgorithmType(AlgorithmType algorithm) {
this.algorithmType = algorithm;
}
public void setMaxRuns(int maxRuns) {
this.maxRuns = maxRuns;
}
public void setMinChunkSize(int minChunkSize) {
this.minChunkSize = minChunkSize;
}
public void setRandomSeed(int randomSeed) {
this.randomSeed = randomSeed;
}
public void setSplitMode(SplitType splitMode) {
this.splitMode = splitMode;
}
public void setSuccessCriteria(double successCriteria) {
this.successCriteria = successCriteria;
}
}
/* ===================================================================
edu.csus.ecs.ssnn.event.NetworkSplitEvent.java
=================================================================== */
package edu.csus.ecs.ssnn.event;
import java.util.EventObject;
import java.util.ArrayList;
import java.util.List;
import edu.csus.ecs.ssnn.nn.DataRegion;
/**
* This event is fired off whenever networks are split by the splitting algorithm.
*/
public class NetworkSplitEvent extends EventObject {
/**
* List of possible split types
*/
public enum SplitType {
chunk,
centroid
}
// Ratio of converged networks to total number of networks.
private double percentConverged = 0.0;
// Size of the current network's chunk
private int chunkSize;
// Number of networks that have converged.
private int networksConverged;
// Number of networks waiting to be trained
private int networksInQueue;
// Dimension across which the split(s) happened
private int splitDimension;
// Total number of networks in the modular network
private int totalNetworks;
private SplitType splitType = SplitType.chunk;
// List of region(s) that were solved (if any)
private List<DataRegion> solvedRegions = new ArrayList<DataRegion>();
// List of regions that were not solved
private List<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
// List of points at which the network was split
private List<Double> splitPoints = new ArrayList<Double>();
/**
* Creates a new instance of NetworkSplitEvent.
*
* @param source Source that fired the event.
* @param chunkSize The size of the current network's chunk.
* @param networksInQueue Number of networks waiting to be trained.
* @param networksConverged The number of networks that have converged.
* @param splitDimension Dimension across which the split(s) happened.
* @param splitType Type of split that occurred.
* @param totalNetworks Total number of networks in the modular network.
* @param percentConverged The ratio of converged networks to the total number of networks.
* @param unsolvedRegions List of regions that were not solved.
* @param solvedRegions List of region(s) that were solved (if any).
* @param splitPoints List of points at which the network was split.
*/
public NetworkSplitEvent(Object source, int chunkSize, int networksInQueue,
        int networksConverged, int splitDimension, SplitType splitType,
        int totalNetworks, double percentConverged,
        List<DataRegion> unsolvedRegions, List<DataRegion> solvedRegions,
        List<Double> splitPoints) {
super(source);
this.chunkSize = chunkSize;
this.networksInQueue = networksInQueue;
this.networksConverged = networksConverged;
this.splitDimension = splitDimension;
this.splitType = splitType;
this.totalNetworks = totalNetworks;
this.percentConverged = percentConverged;
this.unsolvedRegions = unsolvedRegions;
this.solvedRegions = solvedRegions;
this.splitPoints = splitPoints;
}
/**
*
* @return The size of the current network's chunk
*/
public int getChunkSize() {
return (chunkSize);
}
/**
*
* @return Number of networks waiting to be trained
*/
public int getNetworksInQueue() {
return networksInQueue;
}
/**
*
* @return The number of networks that have converged.
*/
public int getNetworksConverged() {
return (networksConverged);
}
/**
*
* @return The ratio of converged network to total number of networks.
*/
public double getPercentConverged() {
return percentConverged;
}
/**
*
* @return List of region(s) that were solved (if any)
*/
public List<DataRegion> getSolvedRegions() {
return solvedRegions;
}
/**
*
* @return Dimension across which the split(s) happened
*/
public int getSplitDimension() {
return splitDimension;
}
/**
*
* @return List of points at which the network was split
*/
public List<Double> getSplitPoints() {
return splitPoints;
}
/**
*
* @return Type of split that occurred
*/
public SplitType getSplitType() {
return splitType;
}
/**
*
* @return Total number of networks in the modular network
*/
public int getTotalNetworks() {
return totalNetworks;
}
/**
*
* @return List of regions that were not solved
*/
public List<DataRegion> getUnsolvedRegions() {
return unsolvedRegions;
}
}
/* ===================================================================
edu.csus.ecs.ssnn.event.NetworkTrainerEventListenerInterface.java
=================================================================== */
package edu.csus.ecs.ssnn.event;
/**
* Listener interface for training neural networks. It provides for starting
* and completing network training.
*/
public interface NetworkTrainerEventListenerInterface {
/**
* Handler for the training completed event. This method is called when the
* modular neural network is done with the training set, either by successfully
* solving all points, or by reaching a failure condition.
*
* @param e event object.
*/
public void trainingCompleted(TrainingCompletedEvent e);
/**
* Handler for a network training event. This method is called when a new neural
* network begins to train.
*
* @param e event object.
*/
public void networkTraining(NetworkTrainingEvent e);
}
/* ===================================================================
edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.java
=================================================================== */
package edu.csus.ecs.ssnn.nn;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
* Modular network that contains an internal list of networks and the regions
* they solve.
*/
public class ModularNeuralNetwork implements Iterable<ModularNeuralNetwork.NetworkRecord> {
private ArrayList<NetworkRecord> networks;
private int numInputs;
private int numOutputs;
private Object trainingData;
private DataRegion inputRange;
private int lockedNetwork; // if set, will only use particular network for solving
/**
* Constructs an empty modular neural network with the given number of
* inputs and outputs.
*
* @param inputs
*            Inputs to the network.
* @param outputs
*            Outputs of the network.
* @param inputRange
*            Total region the final modular network is expected to cover.
*/
public ModularNeuralNetwork(int inputs, int outputs, DataRegion inputRange) {
numInputs = inputs;
numOutputs = outputs;
this.inputRange = inputRange;
networks = new ArrayList<ModularNeuralNetwork.NetworkRecord>();
lockedNetwork = -1;
}
public int getLockedNetwork() { return lockedNetwork; }
public int getNumInputs() { return numInputs; }
public int getNumOutputs() { return numOutputs; }
public DataRegion getInputRange() { return inputRange; }
public Object getTrainingData() { return trainingData; }
/**
* @return number of solved networks in the modular neural network.
*/
public int getSolvedCount() {
int solvedCount = 0;
for(int i=0; i< networks.size(); i++) {
if(networks.get(i).isSolved()) {
solvedCount++;
}
}
return solvedCount;
}
public void setTrainingData(Object o) {
trainingData = o;
}
/**
* Tells the modular neural network to only use the passed network for
* solving points - regardless of the data region.
* @param network_id
*            Id of network to use.
*/
public void setLockedNetwork(int network_id) {
lockedNetwork = network_id;
}
/**
* Adds a sub-network to the modular network which solves a given region.
*
* @param n
*            the new sub-network
* @param r
*            the region the network is responsible for
* @throws NeuralNetworkException
*             thrown when the number of network inputs doesn't match the region
*/
public void addNetwork(NeuralNetwork n, DataRegion r)
throws NeuralNetworkException {
if (n.getNumInputs() != r.getDimensions() ||
n.getNumInputs() != numInputs ||
r.getDimensions() != numInputs) {
throw new NeuralNetworkException("Network has mismatching inputs\n"
+ "Network inputs: " + n.getNumInputs() + "\n"
+ "Region dimensions: " + r.getDimensions() + "\n"
+ "Modular Network inputs: " + numInputs);
}
networks.add(new NetworkRecord(n, r));
}
/**
* Overloaded version for setting solved status (useful for already solved split networks).
* @param n
* @param r
* @param solved
* @throws NeuralNetworkException
*/
public void addNetwork(NeuralNetwork n, DataRegion r, boolean solved) throws NeuralNetworkException {
addNetwork(n, r);
setSolvedStatus(networks.size()-1, solved);
}
/**
* Returns a list of the NetworkRecords with DataRegions that contain a
* point.
*
* @param pointCoords
*            coordinates of the point to find
* @return a list of NetworkRecords containing DataRegions which contain the
*         given point
*/
private List<NetworkRecord> findRecordsAtPoint(List<Double> pointCoords) throws IncompatibleDataException {
ArrayList<NetworkRecord> found = new ArrayList<NetworkRecord>();
for (int i = 0; i < networks.size(); i++) {
if (networks.get(i).getDataRegion().containsPoint(pointCoords)) {
found.add(networks.get(i));
}
}
return found;
}
/**
* Gets a list of DataRegions that contain a given input point.
*
* @param pointCoords
*            an input point
* @return a list of regions that contain the point
* @throws IncompatibleDataException
*/
public List<DataRegion> getDataRegionsAtPoint(List<Double> pointCoords) throws IncompatibleDataException {
List<NetworkRecord> matchingRecords = findRecordsAtPoint(pointCoords);
List<DataRegion> matchingRegions = new ArrayList<DataRegion>();
for (NetworkRecord r : matchingRecords) {
matchingRegions.add(r.getDataRegion());
}
return matchingRegions;
}
/**
* @param id
*            Index of network.
* @return Data region associated with the specified network.
*/
public DataRegion getDataRegion(int id) {
return networks.get(id).getDataRegion();
}
/**
* @param id
*            Index of network.
* @return Neural network associated with the specified network id.
*/
public NeuralNetwork getNetwork(int id) {
return networks.get(id).getNeuralNetwork();
}
/**
* @return Count of all networks in the modular neural network.
*/
public int getNetworkCount() {
return networks.size();
}
/**
* Gets a list of NeuralNetworks that are responsible for handling inputs at
* the given input point.
*
* @param pointCoords
*            an input point
* @return a list of networks that are responsible for providing outputs for
*         that point
* @throws IncompatibleDataException
*/
public List<NeuralNetwork> getNetworksAtPoint(List<Double> pointCoords) throws IncompatibleDataException {
List<NetworkRecord> matchingRecords = findRecordsAtPoint(pointCoords);
List<NeuralNetwork> matchingNetworks = new ArrayList<NeuralNetwork>();
for (NetworkRecord r : matchingRecords) {
matchingNetworks.add(r.getNeuralNetwork());
}
return matchingNetworks;
}
/**
* @param region
*            Region to search for.
* @return Network associated with the specified data region.
*/
public NeuralNetwork getNetworkForRegion(DataRegion region) {
for (NetworkRecord r : networks) {
if (r.getDataRegion() == region) {
return r.getNeuralNetwork();
}
}
return null;
}
/**
* @param id
*            Index of network.
* @return Network record for the associated id.
*/
public NetworkRecord getNetworkRecord(int id) {
return networks.get(id);
}
/**
* Returns a list of outputs for a given set of inputs. An appropriate list
* of networks is selected to feed the inputs into, and the outputs from
* these networks are averaged to produce the final outputs.
*
* @param inputs
*            the list of input values, which must match the
*            ModularNeuralNetwork's number of inputs
* @return the list of output values
* @throws NeuralNetworkException
*/
public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
if (inputs.size() != numInputs) {
throw new NeuralNetworkException("Incorrect number of inputs to ModularNeuralNetwork. "
+ "Input values: " + inputs.size()
+ ", network inputs: " + numInputs);
}
List<NeuralNetwork> matchingNetworks;
// check for locked network. If locked, only use 1 network for solving
if( lockedNetwork >= networks.size() ) {
throw new NeuralNetworkException("Locked network outside of networks range");
} else if(lockedNetwork >= 0) {
List<NeuralNetwork> n = new ArrayList<NeuralNetwork>();
n.add(networks.get(lockedNetwork).getNeuralNetwork());
matchingNetworks = n;
} else {
matchingNetworks = getNetworksAtPoint(inputs);
}
ArrayList<Double> finalOutputs = new ArrayList<Double>();
double sums[] = new double[numOutputs];
for (int i = 0; i < numOutputs; i++) {
sums[i] = 0;
}
for (NeuralNetwork n : matchingNetworks) {
List<Double> outputs = n.getOutputs(inputs);
for (int i = 0; i < numOutputs; i++) {
sums[i] += outputs.get(i);
}
}
for (int i = 0; i < numOutputs; i++) {
finalOutputs.add(sums[i] / matchingNetworks.size());
}
return finalOutputs;
}
/**
* Gets the region that a given network can solve.
*
* @param network
*            the network to find the region for.
* @return the region solved by the network
*/
public DataRegion getRegionForNetwork(NeuralNetwork network) {
for (NetworkRecord r : networks) {
if (r.getNeuralNetwork() == network) {
return r.getDataRegion();
}
}
return null;
}
/**
* @param id
*            Index of network.
* @return Whether specified network has been solved.
*/
public boolean getSolvedStatus(int id) {
return networks.get(id).isSolved();
}
/**
* @return Iterator to list of NetworkRecords that hold the network and data region.
*/
public synchronized Iterator<NetworkRecord> iterator() {
return networks.iterator();
}
/**
* Removes a sub-network from the modular network and the associated region.
* If there is no exactly matching sub-network in the modular network,
* nothing is removed.
*
* @param network
*            the network to remove
*/
public void removeNetwork(NeuralNetwork network) {
NetworkRecord removeMe = null;
for (NetworkRecord r : networks) {
if (r.getNeuralNetwork() == network) {
removeMe = r;
break;
}
}
if (removeMe != null) {
networks.remove(removeMe);
}
}
/**
* Removes a region from the modular network and the associated sub-network.
* If there is no exactly matching region in the modular network, nothing is
* removed.
*
* @param region
*            the region to remove
*/
public void removeRegion(DataRegion region) {
NetworkRecord removeMe = null;
for (NetworkRecord r : networks) {
if (r.getDataRegion() == region) {
removeMe = r;
break;
}
}
if (removeMe != null) {
networks.remove(removeMe);
}
}
/**
*
* @param id
*            Index of network
* @param solved
*            Network's solved status
*/
public void setSolvedStatus(int id, boolean solved) {
networks.get(id).setSolved(solved);
}
/**
* A NetworkRecord associates a NeuralNetwork with a DataRegion. The network
* in the record is the one that can give correct outputs for inputs in the
* associated region of the input domain.
*/
public class NetworkRecord {
private NeuralNetwork n;
private DataRegion r; // region of the input domain associated with neural network
private DataSet solvedPoints; // all points (including outside of domain) that are solved
private DataSet unsolvedPoints; // all points (including outside of domain) that can't be solved
private int[] grayscalePoints; // holds grayscale calculations for reuse
private boolean solved;
private int trainingSetSize; // saves # of points in training set
/**
* Creates a new empty NetworkRecord
*/
public NetworkRecord() {
n = null;
r = null;
solvedPoints = null;
solved = false;
trainingSetSize = -1;
}
/**
*
* @param n
*            Neural network to add to record.
* @param r
*            Data region to add to record.
* @throws NeuralNetworkException
*/
public NetworkRecord(NeuralNetwork n, DataRegion r) throws NeuralNetworkException {
this();
setRecordData(n, r);
}
public NeuralNetwork getNeuralNetwork() { return n; }
public DataRegion getDataRegion() { return r; }
public DataSet getSolvedDataPoints() { return solvedPoints; }
public DataSet getUnsolvedDataPoints() { return unsolvedPoints; }
public int getTrainingSetSize() { return trainingSetSize; }
public int[] getGrayscalePoints() { return grayscalePoints; }
public boolean isSolved() { return solved; }
public void setSolved(boolean solved) {
this.solved = solved;
}
public void setSolvedDataPoints(DataSet points) {
this.solvedPoints = points;
}
public void setUnsolvedDataPoints(DataSet points) {
this.unsolvedPoints = points;
}
public void setTrainingSetSize(int size) {
this.trainingSetSize = size;
}
public void setGrayscalePoints(int[] points) {
this.grayscalePoints = points;
}
/**
* Associates network n with region r
*
* @param n
*            the neural network that handles inputs in region r
* @param r
*            the region of the input domain that the neural network can solve.
* @throws NeuralNetworkException
*             thrown if the number of network inputs doesn't match the data region.
*/
public void setRecordData(NeuralNetwork n, DataRegion r)
throws NeuralNetworkException {
if(n.getNumInputs() != r.getDimensions()) {
throw new NeuralNetworkException("Network inputs do not match region dimensions\n"
+ "Network inputs: " + n.getNumInputs() + "\n"
+ "Region dimensions: " + r.getDimensions());
}
this.n = n;
this.r = r;
}
}
}
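/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): querying a
   ModularNeuralNetwork. getOutputs() averages the outputs of every
   sub-network whose region contains the point, while setLockedNetwork()
   forces a single sub-network to answer regardless of region. The
   modNet and point arguments are assumed to be built elsewhere, since
   DataRegion and DataSet construction live in other files of the tool.
   ------------------------------------------------------------------- */
class ModularNetworkUsageSketch {
    static java.util.List<Double> evaluate(ModularNeuralNetwork modNet,
            java.util.List<Double> point) throws NeuralNetworkException {
        modNet.setLockedNetwork(0);                // answer using sub-network 0 only
        System.out.println(modNet.getOutputs(point));
        modNet.setLockedNetwork(-1);               // -1 restores the default (no lock)
        return modNet.getOutputs(point);           // averaged over all covering regions
    }
}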
/* ===================================================================
edu.csus.ecs.ssnn.nn.NetworkLayer.java
=================================================================== */
package edu.csus.ecs.ssnn.nn;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
* A layer of the network, containing individual nodes.
*/
public class NetworkLayer implements Iterable<NetworkNode> {
// List of nodes in the layer. This does not include a bias node.
private ArrayList<NetworkNode> nodes;
// Helper for training algorithms. This field is not used by the instance and
// serves only as a place for a training algorithm to store data about the instance.
private Object trainingData;
/**
* Constructs a new NetworkLayer containing the given number of cells, all
* of which have previousLayerSize number of weights. All of the weights are
* initialized to zero. To set initial random weights, call
* randomizeWeights.
*
* @param cells
*            the number of cells in this layer
* @param previousLayerSize
*            the number of cells in the previous layer
*/
public NetworkLayer(int cells, int previousLayerSize) {
nodes = new ArrayList<NetworkNode>();
for (int i = 0; i < cells; i++) {
nodes.add(new NetworkNode(previousLayerSize));
}
}
/**
* Assigns small random values to each of the layer's nodes' input weights.
*
* @param weightMin
*            Minimum allowed weight value for the network.
* @param weightMax
*            Maximum allowed weight value for the network.
* @param percOfSpace
*            Allowed initial weight distance from midpoint.
* @see NetworkNode.randomizeWeights()
* @see NeuralNetwork.randomizeWeights()
*/
public void randomizeWeights(double weightMin, double weightMax, double percOfSpace) {
for (NetworkNode n : nodes) {
n.randomizeWeights(weightMin, weightMax, percOfSpace);
}
}
public Object getTrainingData() {
return trainingData;
}
public void setTrainingData(Object trainingData) {
this.trainingData = trainingData;
}
/**
* Gets a particular node inside the layer.
*
* @param node
*            the number of the node to get
* @return the requested network node
*/
public NetworkNode getNode(int node) {
return nodes.get(node);
}
/**
* Gets an Iterator over the nodes in the layer.
*
* @see java.lang.Iterable#iterator()
*/
public Iterator<NetworkNode> iterator() {
return nodes.iterator();
}
/**
* @return number of nodes in the layer.
*/
public int getNodeCount() {
return nodes.size();
}
/**
* Calculates list of output values for the layer given a set of inputs.
*
* @param inputs
*            Inputs to the layer - must match number of weights to layer.
* @return Calculated outputs from the layer.
* @throws NeuralNetworkException
*/
public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
ArrayList<Double> layerOutputs = new ArrayList<Double>();
for (NetworkNode n : nodes) {
layerOutputs.add(n.getOutput(inputs));
}
return layerOutputs;
}
/**
* @return list of weights pulled from all nodes in the layer.
*/
public List<Double> getWeights() {
List<Double> weights = new ArrayList<Double>();
for(NetworkNode node: nodes) {
weights.addAll(node.getWeights());
}
return weights;
}
}
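/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): constructing a
   NetworkLayer and randomizing its weights. With weightMin = -20,
   weightMax = 20 and percOfSpace = 0.1, NetworkNode.randomizeWeights()
   (below) draws each weight from a band of width 4 centered on the
   midpoint 0, i.e. from [-2, 2].
   ------------------------------------------------------------------- */
class NetworkLayerUsageSketch {
    public static void main(String[] args) {
        NetworkLayer layer = new NetworkLayer(3, 2);  // 3 nodes, each with 2 input weights
        layer.randomizeWeights(-20.0, 20.0, 0.1);
        System.out.println(layer.getWeights());       // 9 values: 2 inputs + 1 bias per node
    }
}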
/* ===================================================================
edu.csus.ecs.ssnn.nn.NetworkNode.java
=================================================================== */
package edu.csus.ecs.ssnn.nn;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;
/**
* An individual node in the neural network, which is primarily defined
* by the list of weights connecting it to the bias node and nodes in the
* previous layer.
*
*/
public class NetworkNode implements Iterable<Double> {
// List of weights of nodes in the previous network layer.
private ArrayList<Double> inWeights;
// Weight of the bias node.
private double biasWeight;
// Helper for training algorithms. This field is not used by the instance and
// serves only as a place for a training algorithm to store data about the instance.
private Object trainingData;
// The last computed output value from the NetworkNode.
// Useful to training algorithms.
private double lastOutput;
/**
* Constructs a NetworkNode with a supplied number of input weights, all
* initialized to zero. Call randomizeWeights to assign small random values
* to the weights.
*
* @param inputNodes
*            the number of inputs there are to this node
*/
public NetworkNode(int inputNodes) {
inWeights = new ArrayList<Double>();
for (int i = 0; i < inputNodes; i++) {
inWeights.add(0.0);
}
biasWeight = 0;
trainingData = null;
lastOutput = 0;
}
/**
* Assigns random values to each of the node's input weights. The values
* are scaled to fall around the midpoint of [weightMin, weightMax], using
* scaleRatio to determine how far away from the midpoint is allowed.
*
* @param weightMin
*            Minimum weight value.
* @param weightMax
*            Maximum weight value.
* @param percOfSpace
*            How far away from the midpoint of the weights a new random value
*            is allowed (0.1 = 10%).
* @see NetworkLayer.randomizeWeights()
* @see NeuralNetwork.randomizeWeights()
*/
public void randomizeWeights(double weightMin, double weightMax, double percOfSpace) {
Random r = NNRandom.getRandom();
double weightDelta = (weightMax - weightMin) * percOfSpace;
double scaledMin = (weightMax - weightMin)/2.0 + weightMin - weightDelta/2.0;
for (int i = 0; i < inWeights.size(); i++) {
double weight = r.nextDouble() * weightDelta + scaledMin;
inWeights.set(i, weight);
}
biasWeight = r.nextDouble() * weightDelta + scaledMin;
}
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public double getBiasWeight() { return biasWeight; }
public double getLastOutput() { return lastOutput; }
public Object getTrainingData() { return trainingData; }
public void setBiasWeight(double newWeight) {
biasWeight = newWeight;
}
public void setTrainingData(Object trainingData) {
this.trainingData = trainingData;
}
// </editor-fold>
/**
* Set a particular weight value on the node.
*
* @param weightId
*            Zero-based weight index.
* @param newWeight
*            Weight value to set.
*/
public void setWeight(int weightId, double newWeight) {
inWeights.set(weightId, newWeight);
}
/**
* Gets a particular weight value from the node.
* @param weightId
*            Zero-based weight index.
* @return Weight value
*/
public double getWeight(int weightId) {
return inWeights.get(weightId);
}
/**
* @return List of all weights on the node, including bias weight (at the end).
*/
public List<Double> getWeights() {
// include bias node
List<Double> weights = (List<Double>)inWeights.clone();
weights.add(biasWeight);
return weights;
}
/**
* Gets the number of weight values this node is tracking. The getOutput
* methods need exactly as many inputs as the value returned by this method.
* @return count of all weight values for the node, including bias node.
*/
public int getWeightCount() {
return inWeights.size() + 1; // accounting for bias node
}
/**
*
* @return Iterator over the node's weights.
*/
public Iterator<Double> iterator() {
return inWeights.iterator();
}
/**
* Computes the output of the node based on the given inputs and the stored
* weights.
*
* @param inputValues
*            a list of input values to the node, one for each weight
* @return the output value of the node
* @throws NeuralNetworkException
*/
public double getOutput(List<Double> inputValues) throws NeuralNetworkException {
if (inputValues.size() != inWeights.size()) {
throw new NeuralNetworkException("Number of inputs doesn't match node weights. "
+ "Input values: " + inputValues.size()
+ ", weights: " + inWeights.size());
}
return computeOutput(inputValues);
}
/**
* Helper to the getOutput methods which does the computation and returns
* the output value.
*
* @param inputs
*            a list of doubles representing the inputs to the node
* @return the computed output value
*/
private double computeOutput(List<Double> inputs) {
double sum = 0;
int i = 0;
for (double in : inputs) {
sum += in * inWeights.get(i);
i++;
}
sum += biasWeight; // Bias node which always outputs 1.
double logisticOutput = 1.0 / (1.0 + Math.exp(-sum));
lastOutput = logisticOutput;
return logisticOutput;
}
}
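/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): a worked
   example of the logistic activation in computeOutput() above. With
   weights w1 = 1.0, w2 = -1.0 and bias weight 0.5, the node computes
   output = 1 / (1 + e^-(x1 - x2 + 0.5)).
   ------------------------------------------------------------------- */
class NetworkNodeUsageSketch {
    public static void main(String[] args) throws NeuralNetworkException {
        NetworkNode node = new NetworkNode(2);  // two inputs, weights start at 0
        node.setWeight(0, 1.0);
        node.setWeight(1, -1.0);
        node.setBiasWeight(0.5);
        // For inputs (1.0, 1.0): sum = 1 - 1 + 0.5 = 0.5,
        // so output = 1 / (1 + e^-0.5), approximately 0.622
        System.out.println(node.getOutput(java.util.Arrays.asList(1.0, 1.0)));
    }
}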
/* ===================================================================
edu.csus.ecs.ssnn.nn.NeuralNetwork.java
=================================================================== */
package edu.csus.ecs.ssnn.nn;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
* Defines the structure of the neural network, including all layers and nodes
* and the weights between them. Also handles all necessary input scaling.
*/
public class NeuralNetwork implements Iterable<NetworkLayer> {
// A list of the layers in the network. This includes the hidden layers and
// output layer, but no input layer.
private ArrayList<NetworkLayer> layers;
// The number of inputs this neural network can accept.
private int numInputs;
// The number of weights in this neural network.
// Calculated when network is built.
private int numWeights;
// Helper for training algorithms. This field is not used by the instance and
// serves only as a place for a training algorithm to store data about the instance.
private Object trainingData;
// max/min weight values can be customized. Available to training algorithms.
private double weightMax;
private double weightMin;
// store mins/maxes for all points in set
private DataRegion _scalingInputRegion;
private ArrayList<Double> _inLocalMins;
private ArrayList<Double> _inGlobalMins;
private ArrayList<Double> _inSlopes;
/**
* Constructs a new neural network with the given topology, with all
* connections between nodes having a default weight of 0. To initialize the
* weights with random values, call randomizeWeights.
*
* @param inputs
*            the number of inputs to the network
* @param outputs
*            the number of outputs from the network
* @param HiddenTopology
*            a list of node counts for each hidden layer. Can be null or
*            empty for no hidden layers.
*/
public NeuralNetwork(int inputs, int outputs, List<Integer> HiddenTopology) {
layers = new ArrayList<NetworkLayer>();
numInputs = inputs;
weightMax = 20.0;
weightMin = -20.0;
// An input layer isn't needed because we never have to actually
// calculate anything using it. When the network does a computation, the
// inputs will be supplied to the first hidden layer (or to the output
// layer if there are no hidden layers).
int lastLayerSize = inputs;
// Hidden layers
if (HiddenTopology != null) {
for (Integer i : HiddenTopology) {
layers.add(new NetworkLayer(i, lastLayerSize));
numWeights += i*(lastLayerSize+1); // weights between last layer and current (accounting for bias node)
lastLayerSize = i;
}
}
// Output layer
layers.add(new NetworkLayer(outputs, lastLayerSize));
numWeights += outputs*(lastLayerSize+1);
}
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public int getNumInputs() { return numInputs; }
public int getNumWeights() { return numWeights; }
public Object getTrainingData() { return trainingData; }
public double getWeightMax() { return weightMax; }
public double getWeightMin() { return weightMin; }
public DataRegion getScalingRegion() { return _scalingInputRegion; }
public void setTrainingData(Object trainingData) {
this.trainingData = trainingData;
}
public void setWeightMax(double weightMax) {
this.weightMax = weightMax;
}
public void setWeightMin(double weightMin) {
this.weightMin = weightMin;
}
// </editor-fold>
/**
* Initializes all the weights in the network to random values between
* the min/max allowed weight values.
*
* @see NetworkLayer.randomizeWeights()
* @see NetworkNode.randomizeWeights()
*/
public void randomizeWeights(double percOfSpace) {
for (NetworkLayer l : layers) {
l.randomizeWeights(weightMin, weightMax, percOfSpace);
}
}
/**
* Generates outputs from the neural network given a list of inputs.
*
* @param inputs
*            the input values to the network
* @return a list of outputs from the network
* @throws NeuralNetworkException on incorrect number of inputs to the network.
*/
public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
// Make sure there's a correct number of inputs before beginning.
if (inputs.size() != numInputs) {
throw new NeuralNetworkException("Incorrect number of inputs to network.");
}
// scale inputs
List<Double> nextLayerInputs = scale(inputs, _inLocalMins, _inGlobalMins, _inSlopes);
for (NetworkLayer l : layers) {
nextLayerInputs = l.getOutputs(nextLayerInputs);
}
// scale outputs
List<Double> outputs = nextLayerInputs;
return outputs;
}
/**
* Scales the point using the network specific scaling factors.
*
* @param values
*            The values that make up the data point.
* @param domainMins
*            Minimum domain values. Usually the training set mins.
* @param rangeMins
*            Minimum range values. Usually the minimum of the data region
*            covered by the network.
* @param slopes
*            Previously calculated slopes of delta(range)/delta(domain)
* @return Scaled point
* @throws NeuralNetworkException on mismatched dimensions.
*/
protected static List<Double> scale(List<Double> values, List<Double> domainMins,
List<Double> rangeMins, List<Double> slopes)
throws NeuralNetworkException {
if( (values.size() != domainMins.size()) || (domainMins.size() != rangeMins.size()) ) {
throw new NeuralNetworkException("Unable to scale points - dimension mismatch");
}
List<Double> scaled = new ArrayList<Double>();
for(int i=0; i < values.size(); i++) {
scaled.add( (values.get(i) - domainMins.get(i)) * slopes.get(i) + rangeMins.get(i) );
}
return scaled;
}
/**
*
* @param layer
*            Zero-based index to layer in the neural network.
* @return the requested network layer
*/
public NetworkLayer getLayer(int layer) {
return layers.get(layer);
}
/**
* Gets the number of layers in the network, including the hidden and output
* layers but NOT an input layer.
*
* @return the number of hidden and output layers in the network
*/
public int getLayerCount() {
return this.layers.size();
}
public List<Double> getWeights() {
List<Double> weights = new ArrayList<Double>();
for(NetworkLayer layer: layers) {
weights.addAll(layer.getWeights());
}
return weights;
}
/**
* @return an Iterator over the network layers.
*/
public Iterator<NetworkLayer> iterator() {
return layers.iterator();
}
/**
* @return count of outputs to the Neural Network (i.e. nodes in last layer).
*/
public int getNumOutputs() {
return layers.get(layers.size()-1).getNodeCount();
}
/**
* @return node count of each hidden layer
*/
public List<Integer> getHiddenTopology() {
ArrayList<Integer> topology = new ArrayList<Integer>();
// stopping before last layer to avoid including output layer
for(int i=0; i<layers.size()-1; i++) {
topology.add(layers.get(i).getNodeCount());
}
return topology;
}
/**
* Sets input domain scaling for network
* Saves scaling factor: (Range Max - Range Min) / (Domain Max - Domain Min)
* and min value for each input dimension.
*
* @param local
*            DataRegion for sub network inputs
* @param global
*            DataRegion for entire modular neural network inputs
* @throws IncompatibleDataException on dimension mismatch.
*/
public void setInputScaling(DataRegion local, DataRegion global)
throws IncompatibleDataException {
_scalingInputRegion = local;
_inLocalMins = new ArrayList<Double>();
_inGlobalMins = new ArrayList<Double>();
_inSlopes = new ArrayList<Double>();
// want to scale from local back up to global
setScaling(_inLocalMins, _inGlobalMins, _inSlopes, local, global);
}
/**
* Determines scaling factor (Range Max - Range Min) / (Domain Max - Domain Min)
* for each dimension, and saves the dimension minimums required to do scaling.
*
* @param domainMins
*            Array to save domain mins to.
* @param rangeMins
*            Array to save range mins to.
* @param slopes
*            Array to save slopes (scaling factors) to.
* @param domain
*            Domain to calculate scaling from.
* @param range
*            Range to calculate scaling from.
*
* @throws IncompatibleDataException on dimension mismatch.
*/
protected static void setScaling(ArrayList<Double> domainMins, ArrayList<Double> rangeMins,
ArrayList<Double> slopes, DataRegion domain, DataRegion range)
throws IncompatibleDataException {
// check dimensions
if( domain.getDimensions() != range.getDimensions() ) {
throw new IncompatibleDataException("Dimensions for network scaling don't match");
}
for(int i=0; i < domain.getDimensions(); i++) {
domainMins.add(i, domain.getMin(i));
rangeMins.add(i, range.getMin(i));
slopes.add(i, (range.getMax(i) - range.getMin(i)) / (domain.getMax(i) - domain.getMin(i)));
}
}
/**
* Loads array of weights into the neural network
*
* @param weights
*            Array of weights (must match network node weight count)
*/
public void setWeights(Double[] weights) {
// set weights on nodes in first and hidden layers
int p_index = -1;
int prevLayerCount = getNumInputs();
for (int layerNum = 0; layerNum < getLayerCount(); layerNum++) {
NetworkLayer currLayer = getLayer(layerNum);
// For each node in the layer:
for (int currNodeNum = 0; currNodeNum < currLayer.getNodeCount(); currNodeNum++) {
NetworkNode currNode = currLayer.getNode(currNodeNum);
currNode.setBiasWeight(weights[++p_index]);
// For each weight from curr node to last layer nodes
for (int lastNodeNum = 0; lastNodeNum < prevLayerCount; lastNodeNum++) {
currNode.setWeight(lastNodeNum, weights[++p_index]);
}
}
prevLayerCount = currLayer.getNodeCount();
}
}
}
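/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): building a
   2-input, 1-output network with one hidden layer of 3 nodes. The
   weight count follows the constructor's formula above: 3*(2+1) weights
   into the hidden layer plus 1*(3+1) into the output layer, i.e. 13.
   ------------------------------------------------------------------- */
class NeuralNetworkUsageSketch {
    public static void main(String[] args) {
        NeuralNetwork net = new NeuralNetwork(2, 1, java.util.Arrays.asList(3));
        System.out.println(net.getNumWeights());  // prints 13
        // spread initial random weights over 10% of the default [-20, 20] range
        net.randomizeWeights(0.1);
        System.out.println(net.getWeights());     // 13 small random values
    }
}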
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.BackPropagation.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;
/**
* This class implements the backpropagation training algorithm for single
* neural networks.
*/
public class BackPropagation extends TrainingAlgorithm {
public static class Params {
protected double learningRate; // Backpropagation learning rate
protected double momentum; // Backpropagation momentum. 0 disables.
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public double getLearningRate() { return this.learningRate; }
public double getMomentum() { return this.momentum; }
public void setLearningRate(double learningRate) {
this.learningRate = learningRate;
}
public void setMomentum(double momentum) {
this.momentum = momentum;
}
// </editor-fold>
}
protected Params params;
public BackPropagation(Params params, int maxIterations, double[] acceptableError) {
super(maxIterations, acceptableError);
this.params = params;
}
public boolean train(NeuralNetwork n, DataSet d) throws NeuralNetworkException {
VerifyInputs(n, d);
// Put training data object in each node:
int inputCount = n.getNumInputs();
for (NetworkLayer l : n) {
for (NetworkNode nn : l) {
nn.setTrainingData(new BackPropNodeData(inputCount));
}
inputCount = l.getNodeCount();
}
int correctInARow = 0;
boolean converged = false;
int dataIndex = 0;
iterations = 0;
// train until stop conditions met
while (iterations < maxIterations && !converged) {
// Reset errors on all network nodes:
for (NetworkLayer l : n) {
for (NetworkNode nn : l) {
((BackPropNodeData) nn.getTrainingData()).error = 0;
}
}
List<Double> inputs = d.getPair(dataIndex).getInputs();
// Forward pass:
List<Double> outs = n.getOutputs(inputs);
// Compute output errors:
double[] errors = new double[d.getNumOutputs()];
for (int i = 0; i < d.getNumOutputs(); i++) {
// Error = desired output - actual output
errors[i] = (d.getPair(dataIndex).getOutput(i) - outs.get(i));
}
// See if all outputs are within acceptable error
boolean correctOutput = true;
for (int i = 0; i < errors.length; i++) {
if (Math.abs(errors[i]) > acceptableError[i]) {
correctOutput = false;
break;
} else {
if(DEBUG) {
System.out.println("Hit!");
System.out.flush();
}
}
}
if (correctOutput) {
correctInARow++;
// must get correct output on all items of dataset
if (correctInARow >= d.size()) {
converged = true;
continue;
}
} else {
if(DEBUG) {
if (correctInARow != 0) {
System.out.println("Chain broke at " + correctInARow);
}
}
correctInARow = 0;
}
// Start of backward pass
// Adjust output errors:
NetworkLayer outputLayer = n.getLayer(n.getLayerCount() - 1);
for (int i = 0; i < outputLayer.getNodeCount(); i++) {
NetworkNode currentNode = outputLayer.getNode(i);
BackPropNodeData currNodeData = (BackPropNodeData) currentNode.getTrainingData();
currNodeData.error = errors[i] * currentNode.getLastOutput() * (1 - currentNode.getLastOutput());
}
// Adjust hidden layer errors:
// For each layer (going backwards from the last hidden layer):
for (int layerNum = n.getLayerCount() - 2; layerNum >= 0; layerNum--) {
NetworkLayer currLayer = n.getLayer(layerNum);
NetworkLayer nextLayer = n.getLayer(layerNum + 1);
// For each node in the current layer:
for (int currLayerNodeNum = 0; currLayerNodeNum < currLayer.getNodeCount(); currLayerNodeNum++) {
NetworkNode currentNode = currLayer.getNode(currLayerNodeNum);
BackPropNodeData currNodeData = (BackPropNodeData) currentNode.getTrainingData();
// For each node in the next layer:
for (int nextLayerNodeNum = 0; nextLayerNodeNum < nextLayer.getNodeCount(); nextLayerNodeNum++) {
NetworkNode nextNode = nextLayer.getNode(nextLayerNodeNum);
BackPropNodeData nextNodeData = (BackPropNodeData) nextNode.getTrainingData();
currNodeData.error += nextNode.getWeight(currLayerNodeNum)
* nextNodeData.error
* currentNode.getLastOutput()
* (1 - currentNode.getLastOutput());
}
}
}
// Start of forward weight adjustment
// Adjust the first layer based on the inputs:
NetworkLayer firstLayer = n.getLayer(0);
for (int currNodeNum = 0; currNodeNum < firstLayer.getNodeCount(); currNodeNum++) {
NetworkNode currNode = firstLayer.getNode(currNodeNum);
BackPropNodeData currNodeData = (BackPropNodeData) currNode.getTrainingData();
for (int inputNum = 0; inputNum < inputs.size(); inputNum++) {
// Update the weight from the input layer to this layer:
double weightDelta = params.learningRate * inputs.get(inputNum)
* currNodeData.error
+ currNodeData.lastWeightChanges.get(inputNum)
* params.momentum;
currNode.setWeight(inputNum, currNode.getWeight(inputNum) + weightDelta);
// Update the "previous change" for momentum:
currNodeData.lastWeightChanges.set(inputNum, weightDelta);
}
// Update bias node weight too:
double biasWeightDelta = params.learningRate
* currNodeData.error
+ currNodeData.lastWeightChanges.get(currNodeData.lastWeightChanges.size() - 1)
* params.momentum;
currNode.setBiasWeight(currNode.getBiasWeight() + biasWeightDelta);
currNodeData.lastWeightChanges.set(currNodeData.lastWeightChanges.size() - 1, biasWeightDelta);
}
// Adjust remaining layers based on previous layer's outputs:
// For each layer:
for (int layerNum = 1; layerNum < n.getLayerCount(); layerNum++) {
NetworkLayer currLayer = n.getLayer(layerNum);
NetworkLayer prevLayer = n.getLayer(layerNum - 1);
// For each node in the layer:
for (int currNodeNum = 0; currNodeNum < currLayer.getNodeCount(); currNodeNum++) {
NetworkNode currNode = currLayer.getNode(currNodeNum);
BackPropNodeData currNodeData = (BackPropNodeData) currNode.getTrainingData();
// For each node in the previous layer:
for (int lastNodeNum = 0; lastNodeNum < prevLayer.getNodeCount(); lastNodeNum++) {
NetworkNode prevNode = prevLayer.getNode(lastNodeNum);
// Update weight between currNode and prevNode:
double weightDelta = params.learningRate
* prevNode.getLastOutput()
* currNodeData.error
+ currNodeData.lastWeightChanges.get(lastNodeNum)
* params.momentum;
currNode.setWeight(lastNodeNum, currNode.getWeight(lastNodeNum) + weightDelta);
// Update the "previous change" for momentum:
currNodeData.lastWeightChanges.set(lastNodeNum, weightDelta);
}
// Update bias node weight too:
double biasWeightDelta = params.learningRate
* currNodeData.error
+ currNodeData.lastWeightChanges.get(currNodeData.lastWeightChanges.size() - 1)
* params.momentum;
currNode.setBiasWeight(currNode.getBiasWeight() + biasWeightDelta);
currNodeData.lastWeightChanges.set(currNodeData.lastWeightChanges.size() - 1, biasWeightDelta);
}
}
ReportProgress();
dataIndex = ++dataIndex % d.size();
iterations++;
}
cleanNetwork(n); // Clean up training data
return converged;
}
/**
* BackPropNodeData objects are stored inside network nodes and are used to
* carry information useful to the backpropagation training algorithm.
*/
private class BackPropNodeData {
private double error = 0; // Computed output error of the node.
// private double lastChange = 0;
// List of previous single changes to each weight. Used for momentum.
private ArrayList<Double> lastWeightChanges;
private BackPropNodeData(int inputs) {
lastWeightChanges = new ArrayList<Double>();
for (int i = 0; i < inputs; i++) {
lastWeightChanges.add(0d);
}
lastWeightChanges.add(0d); // Last one is for bias node weight
}
}
}
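/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): wiring up the
   BackPropagation trainer. The network and single-output dataset are
   assumed to be built elsewhere in the tool, and the learning rate,
   momentum, iteration limit, and error bound shown are example values.
   ------------------------------------------------------------------- */
class BackPropagationUsageSketch {
    static boolean trainWithBackProp(NeuralNetwork net, DataSet data)
            throws NeuralNetworkException {
        BackPropagation.Params p = new BackPropagation.Params();
        p.setLearningRate(0.3);  // step size for weight updates
        p.setMomentum(0.9);      // fraction of the previous change carried forward (0 disables)
        // stop after 100000 iterations, or once every output is within 0.1
        BackPropagation trainer = new BackPropagation(p, 100000, new double[] { 0.1 });
        return trainer.train(net, data);  // true if the network converged
    }
}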
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.GeneticAlgorithm.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import java.util.Arrays;
import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;
/**
* Genetic Algorithm to replace backpropagation for training NN.
*
* An Individual consists of one set of data points that contain all the weights
* for the network.
*
* Fitness function is determined by setting weights from individual values and
* running through the training data comparing outputs.
* 1st classification is # of entries outside of acceptable.
*/
public class GeneticAlgorithm extends TrainingAlgorithm {
public static class Params {
protected int genSize; // number of individuals in generation.
protected int tournamentSize; // # of individuals selected for a tournament
protected int winnerCount; // # of winners from a tournament. Must divide into genSize
protected double mutationChance;
protected boolean elitism;
protected crossoverType crossover;
protected FitnessInterface fitnessFxn; // chosen fitness function
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public boolean getElitism() { return this.elitism; }
public int getGenerationSize() { return this.genSize; }
public int getTournamentSize() { return this.tournamentSize; }
public int getWinnerCount() { return this.winnerCount; }
public double getMutationChance() { return this.mutationChance; }
public crossoverType getCrossoverType() { return this.crossover; }
public FitnessInterface getFitness() { return fitnessFxn; }
public void setElitism(boolean elitism) {
this.elitism = elitism;
}
public void setGenerationSize(int size) {
this.genSize = size;
}
public void setTournamentSize(int size) {
this.tournamentSize = size;
}
public void setWinnerCount(int count) {
this.winnerCount = count;
}
public void setMutationChance(double chance) {
this.mutationChance = chance;
}
public void setCrossoverType(crossoverType type) {
this.crossover = type;
}
public void setFitness(FitnessInterface f) {
this.fitnessFxn = f;
}
// </editor-fold>
}
protected Params params;
public static enum crossoverType {
ONE_POINT,
TWO_POINT
}
private Individual[] pool; // holds individuals in the generation
private int[] selected; // holds index of selected children for next gen
protected Individual g; // global best
protected Individual p; // best of previous generation
// Constructor
public GeneticAlgorithm(Params params, int maxIterations, double[] acceptableError)
throws DataFormatException {
super(maxIterations, acceptableError);
this.params = params;
if(params.tournamentSize < 2) {
throw new DataFormatException("Tournament size must be >= 2");
}
if((params.genSize % params.winnerCount) != 0) {
throw new DataFormatException("Winner count must divide into generation size");
}
}
// train network against dataset
// Each individual defines a list of weights for the network
// Individuals are tested against all points in the dataset, then evolved
// into a new set of weights.
// For fairness, iterations are incremented each time a single dataset pair is tested
public boolean train(NeuralNetwork n, DataSet d)
throws IncompatibleDataException, NeuralNetworkException {
VerifyInputs(n, d);
iterations = 0;
boolean converged = false;
double weightMin = n.getWeightMin(); // min possible value for weight
double weightMax = n.getWeightMax(); // max possible value for weight
int weightCount = n.getNumWeights(); // includes weights for bias node
this.pool = new Individual[params.genSize];
this.selected = new int[params.genSize];
int tmp_g = 0; // reduces # of clones by pointing to an index and setting this.g after the loop
int prev_i = 0;
Object prev_fit;
// generate initial population
for(int i=0; i < params.genSize; i++) {
iterations += d.size();
Individual ind = new Individual(weightCount, weightMin, weightMax);
// set first particle to existing network weights
// these will either be random, or the last set weights
if(i == 0) {
int cnt = -1;
for(double weight: n.getWeights()) {
ind.setPosition(++cnt, weight);
}
}
ind.fitness = params.fitnessFxn.getFitness(ind.position, n, d, acceptableError);
this.pool[i] = ind;
// find first global best
if(i ==0 || params.fitnessFxn.compare(ind.fitness, pool[tmp_g].fitness)) {
tmp_g = i;
// check for convergence
if(params.fitnessFxn.hasConverged(pool[tmp_g].fitness)) {
converged = true;
break;
}
}
}
this.g = pool[tmp_g].clone();
prev_i = tmp_g; // setting best of current generation
prev_fit = this.g.fitness;
// train!
while(iterations < maxIterations && !converged) {
iterations += params.genSize * d.size();
selection();
for(int i=0; i < params.genSize; i=i+2) {
switch(params.crossover) {
case ONE_POINT:
crossover_onepoint(selected[i], selected[i+1], i, i+1);
break;
case TWO_POINT:
crossover_twopoint(selected[i], selected[i+1], i, i+1);
break;
}
}
mutation();
// find best fit for previous generation
for(int i=0; i < params.genSize; i++) {
pool[i].fitness = params.fitnessFxn.getFitness(pool[i].position, n, d, acceptableError);
// set initial prev_fit
if( i==0 ) {
prev_fit = pool[0].fitness;
prev_i = 0;
}
// check for best fitness of generation
if(params.fitnessFxn.compare(pool[i].fitness, prev_fit)) {
prev_fit = pool[i].fitness;
prev_i = i;
}
}
if(params.elitism) {
// copy global best string to 0th position
pool[0].copy(this.g);
}
// check for new global best
if(params.fitnessFxn.compare(prev_fit, this.g.fitness)) {
this.g.copy(pool[prev_i]);
// check for convergence
if(params.fitnessFxn.hasConverged(this.g.fitness)) {
converged = true;
n.setWeights(this.g.position); // assign converged weights to network
}
}
ReportProgress();
}
cleanNetwork(n); // Clean up training data
return converged;
}
// chooses individuals from the generation using tournament selection
// genSize/2 tournaments are run
// for each tournament, tournament_size individuals are chosen
// the best 2 move on to the next generation
private void selection() {
Random rnd = NNRandom.getRandom();
int[] contenders = new int[params.tournamentSize];
// generate tournaments
for(int i=0; i < params.genSize-params.winnerCount; i+=params.winnerCount) {
// choose contenders
for(int j=0; j < params.tournamentSize; j++) {
contenders[j] = rnd.nextInt(params.genSize-1);
}
// fight!
Arrays.sort(contenders);
System.arraycopy(contenders, 0, selected, i, params.winnerCount);
}
}
// chooses random crossover point(s) using CrossoverChance
private void crossover_onepoint(int parent1, int parent2, int child1, int child2) {
Random rnd = NNRandom.getRandom();
int chrom_len = this.pool[0].size;
int site = rnd.nextInt(chrom_len);
Individual ind1 = this.pool[child1].clone();
Individual ind2 = this.pool[child2].clone();
if(site != 0) {
for(int i=site+1; i < chrom_len; i++) {
ind1.position[i] = this.pool[parent2].position[i];
ind2.position[i] = this.pool[parent1].position[i];
}
}
this.pool[child1] = ind1;
this.pool[child2] = ind2;
}
// chooses random crossover point(s) using CrossoverChance
private void crossover_twopoint(int parent1, int parent2, int child1, int child2) {
Random rnd = NNRandom.getRandom();
int chrom_len = this.pool[0].size;
int site1 = rnd.nextInt(chrom_len);
int site2 = rnd.nextInt(chrom_len-site1)+site1; // make sure site2 in [site1, chrom_len]
Individual ind1 = this.pool[child1].clone();
Individual ind2 = this.pool[child2].clone();
for(int i=site1; i <= site2; i++) {
ind1.position[i] = this.pool[parent2].position[i];
ind2.position[i] = this.pool[parent1].position[i];
}
this.pool[child1] = ind1;
this.pool[child2] = ind2;
}
// returns True if bit should be flipped
private boolean flip() {
return NNRandom.getRandom().nextDouble() < params.mutationChance;
}
private void mutation() {
for(int i=0; i < params.genSize; i++) {
for(int j=0; j < this.pool[0].size; j++) {
// converting double into long (so mutation can be done on bits)
long bits = Double.doubleToLongBits(this.pool[i].position[j]);
for(int k=0; k < Long.SIZE; k++) {
if( flip() ) {
bits ^= 1L << k; // long literal so all 64 bits of the double's representation can be flipped
}
}
this.pool[i].position[j] = Double.longBitsToDouble(bits);
}
}
}
/**
* Defines individual to be used in Genetic Algorithm.
*/
private class Individual {
public Double[] position;
public Object fitness; // fitness of individual
protected int size;
protected double pmin;
protected double pmax;
// create new individual and fill with data from p
Individual(Individual p) /*throws IncompatibleDataException*/ {
reset(p.size);
copy(p);
}
Individual(int size, double pmin, double pmax) {
reset(size);
this.pmin = pmin;
this.pmax = pmax;
mutation(1, pmin, pmax);
}
private void reset(int size) {
//create initial arrays
this.size = size;
position = new Double[size];
fitness = Integer.MAX_VALUE;
}
// debug only
protected void printPosition() {
if(DEBUG) {
for(int i=0; i < this.size; i++) {
System.out.print(position[i] + " ");
}
System.out.print("\n");
System.out.flush();
}
}
@Override
protected Individual clone() {
return new Individual(this);
}
// copies particle elements
private void copy(Individual p) {
for(int i=0; i < this.size; i++) {
this.position[i] = p.position[i];
this.fitness = p.fitness;
}
}
// set position array value
public void setPosition(int index, double value) {
this.position[index] = value;
}
public void mutation(double mutationProbability) {
mutation(mutationProbability, this.pmin, this.pmax);
}
// mutates individual weights using Mutation probability
private void mutation(double mutationProbability, double min, double max) {
Random r = NNRandom.getRandom();
double range = max - min;
for(int i=0; i < this.size; i++) {
if(r.nextDouble() < mutationProbability) {
this.position[i] = r.nextDouble() * range + min;
}
}
}
}
}
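/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): configuring
   the GeneticAlgorithm trainer. The fitness function and dataset are
   assumed to come from elsewhere in the tool, and all values shown are
   examples. Note the constructor's checks: tournamentSize >= 2, and
   winnerCount must divide evenly into genSize.
   ------------------------------------------------------------------- */
class GeneticAlgorithmUsageSketch {
    static boolean trainWithGA(NeuralNetwork net, DataSet data,
            edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface fitness)
            throws DataFormatException, IncompatibleDataException, NeuralNetworkException {
        GeneticAlgorithm.Params p = new GeneticAlgorithm.Params();
        p.setGenerationSize(50);   // individuals per generation
        p.setTournamentSize(4);    // contenders per tournament
        p.setWinnerCount(2);       // 2 divides evenly into 50
        p.setMutationChance(0.01); // per-bit flip probability
        p.setElitism(true);        // keep the global best in slot 0
        p.setCrossoverType(GeneticAlgorithm.crossoverType.TWO_POINT);
        p.setFitness(fitness);
        GeneticAlgorithm trainer = new GeneticAlgorithm(p, 500000, new double[] { 0.1 });
        return trainer.train(net, data);
    }
}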
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.NeighborAnnealing.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;
/**
*
* Neighbor Annealing to replace backpropagation for training NN.
*
* A point in the search space consists of one set of data points that contain
* all the weights for the network.
*
* Neighbor annealing generates a new point no farther than DELTA away
* from the current point and compares fitness.
* If new point has better fitness, moves to the new point and repeats.
* Each iteration, DELTA is decreased until endpoint is reached.
*
*/
public class NeighborAnnealing extends TrainingAlgorithm {
public static class Params {
protected double startDelta;
protected double endDelta;
protected double stepSize; // amount to adjust delta by each iteration
protected FitnessInterface fitnessFxn; // chosen fitness function
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public double getStartDelta() { return this.startDelta; }
public double getEndDelta() { return this.endDelta; }
public double getStepSize() { return this.stepSize; }
public FitnessInterface getFitness() { return fitnessFxn; }
public void setStartDelta(double startDelta) {
this.startDelta = startDelta;
}
public void setEndDelta(double endDelta) {
this.endDelta = endDelta;
}
public void setStepSize(double stepSize) {
this.stepSize = stepSize;
}
public void setFitness(FitnessInterface f) {
this.fitnessFxn = f;
}
// </editor-fold>
}
protected Params params;
// Constructor
public NeighborAnnealing(Params params, int maxIterations, double[] acceptableError) {
super(maxIterations, acceptableError);
this.params = params;
}
// train network against dataset
// For fairness, iterations are incremented each time a single dataset pair is tested
public boolean train(NeuralNetwork n, DataSet d)
throws IncompatibleDataException, NeuralNetworkException {
VerifyInputs(n, d);
iterations = 0;
boolean converged = false;
double delta = params.startDelta;
int weightCount = n.getNumWeights(); // includes weights for bias node
Double[] next; // holds neighbor position
Object next_fitness; // holds neighbor fitness
// create initial weights (defaulting to current network weights)
// which are either random or last network weights
// get initial fitness and check for convergence
Double[] curr = new Double[weightCount];
int i = -1;
for(double weight: n.getWeights()) {
curr[++i] = weight;
}
Object curr_fitness = params.fitnessFxn.getFitness(curr, n, d, acceptableError);
iterations += d.size(); // account for initial fitness calc
if(params.fitnessFxn.hasConverged(curr_fitness)) {
converged = true;
}
// calculate weight range for scaling deltas
double weight_range = n.getWeightMax() - n.getWeightMin();
// generate and check neighbors
while(delta > params.endDelta && iterations < maxIterations && !converged) {
iterations += d.size();
// scale delta from [0, 1] out to the weight domain [weightMin, weightMax]
double scaled_delta = delta * weight_range + n.getWeightMin();
next = neighbor(curr, scaled_delta);
next_fitness = params.fitnessFxn.getFitness(next, n, d, acceptableError);
if(params.fitnessFxn.compare(next_fitness, curr_fitness)) {
curr = next;
curr_fitness = next_fitness;
// check for convergence
if(params.fitnessFxn.hasConverged(curr_fitness)) {
converged = true;
}
}
// adjust delta
delta -= params.stepSize;
ReportProgress();
}
cleanNetwork(n); // Clean up training data
// return success/failure
return converged;
}
protected static Double[] neighbor(Double[] pos, double delta) {
Double[] new_pos = new Double[pos.length];
Random r = NNRandom.getRandom();
double diff;
for(int i=0; i < pos.length; i++) {
// diff in [-delta, delta]
diff = r.nextDouble() * 2 * delta - delta;
new_pos[i] = pos[i] + diff;
}
return new_pos;
}
}
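/* -------------------------------------------------------------------
   Illustrative sketch (not part of the original source): configuring
   the NeighborAnnealing trainer. With startDelta = 1.0, endDelta = 0.0
   and stepSize = 0.001, the annealing loop takes at most 1000 shrinking
   steps. All values are examples, and the fitness function and dataset
   are assumed to come from elsewhere in the tool.
   ------------------------------------------------------------------- */
class NeighborAnnealingUsageSketch {
    static boolean trainWithNA(NeuralNetwork net, DataSet data,
            edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface fitness)
            throws IncompatibleDataException, NeuralNetworkException {
        NeighborAnnealing.Params p = new NeighborAnnealing.Params();
        p.setStartDelta(1.0);   // initial neighborhood size, as a fraction of the weight range
        p.setEndDelta(0.0);     // loop stops once delta shrinks to this value
        p.setStepSize(0.001);   // delta decreases by this amount each iteration
        p.setFitness(fitness);
        NeighborAnnealing trainer = new NeighborAnnealing(p, 500000, new double[] { 0.1 });
        return trainer.train(net, data);
    }
}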
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.ParticleSwarmOptimization.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;
/**
* Particle Swarm Optimization to replace backpropagation for training NN.
*
* A particle consists of one set of data points that contain all the weights
* for the network.
*
* Fitness function is determined by setting weights from particle values and
* running through the training data comparing outputs.
* 1st classification is # of entries outside of acceptable.
* 2nd classification is average difference from output, for ACCEPTABLE results.
*/
public class ParticleSwarmOptimization extends TrainingAlgorithm {
public static class Params {
protected int swarmSize;
protected double c0; // randomization constant (velocity)
protected double c1; // randomization constant (personal)
protected double c2; // randomization constant (global)
protected FitnessInterface fitnessFxn; // chosen fitness function
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
public int getSwarmSize() { return this.swarmSize; }
public double getC0() { return this.c0; }
public double getC1() { return this.c1; }
public double getC2() { return this.c2; }
public FitnessInterface getFitness() { return fitnessFxn; }
public void setSwarmSize(int swarmSize) {
this.swarmSize = swarmSize;
}
public void setC0(double c0) {
this.c0 = c0;
}
public void setC1(double c1) {
this.c1 = c1;
}
public void setC2(double c2) {
this.c2 = c2;
}
public void setFitness(FitnessInterface f) {
this.fitnessFxn = f;
}
// </editor-fold>
}
protected Params params;
private Particle[] swarm; // holds individual particles in the swarm
protected int g; // global best
// Constructor
public ParticleSwarmOptimization(Params params, int maxIterations, double[] acceptableError) {
super(maxIterations, acceptableError);
this.params = params;
}
// train network against dataset
// Each particle defines a list of weights for the network
// Particles are tested against all points in the dataset, then moved
// to a new set of weights.
// For fairness, iterations are incremented each time a single dataset pair is tested
public boolean train(NeuralNetwork n, DataSet d)
throws IncompatibleDataException, NeuralNetworkException {
VerifyInputs(n, d);
iterations = 0; // reset iteration count
boolean converged = false;
double weightMin = n.getWeightMin(); // min possible value for weight
double weightMax = n.getWeightMax(); // max possible value for weight
int weightCount = n.getNumWeights(); // includes weights for bias node
Object fitness;
// Initialize Particles and randomize values
this.g = 0;
this.swarm = new Particle[params.swarmSize];
for(int i=0; i < params.swarmSize; i++) {
// each particle scans through entire dataset
// have to increase iterations in init phase because we increase
// chance to find convergent solution during init phase by increasing
// swarmSize. Otherwise, a sufficiently large swarmSize would likely
// return convergent value "for free" if we're not counting iterations.
iterations += d.size();
Particle p = new Particle(weightCount, weightMin, weightMax);
// set first particle to existing network weights
// these will either be random, or the last set weights
if(i == 0) {
int cnt = -1;
for(double weight: n.getWeights()) {
p.setPosition(++cnt, weight);
}
}
p.best_fitness = params.fitnessFxn.getFitness(p.position, n, d, acceptableError);
this.swarm[i] = p;
// find initial global best
if(i == 0 || params.fitnessFxn.compare(p.best_fitness, swarm[g].best_fitness)) {
g = i;
// check for convergence
if(params.fitnessFxn.hasConverged(swarm[g].best_fitness)) {
converged = true;
break;
}
}
}
// train!
while(iterations < maxIterations && !converged) {
for(int i=0; i < params.swarmSize; i++) {
// move particle
update_velocity(swarm[i]);
update_position(swarm[i]);
fitness = params.fitnessFxn.getFitness(swarm[i].position, n, d, acceptableError);
// each particle scans through entire dataset
iterations += d.size();
// compare against particles pBest, set new pBest as appropriate
if(params.fitnessFxn.compare(fitness, swarm[i].best_fitness) ) {
swarm[i].setBest();
swarm[i].best_fitness = fitness;
// check any new pbests to see if they are gbest also
if(params.fitnessFxn.compare(fitness, swarm[g].best_fitness)) {
if(DEBUG){
System.out.println("Found new gbest. " +
"Was (" + g + "," + swarm[g].best_fitness + "). " +
"Now (" + i + "," + swarm[i].best_fitness + ").");
}
g = i;
// check for convergence
if(params.fitnessFxn.hasConverged(swarm[g].best_fitness)) {
converged = true;
break; // break out to the while loop
}
}
}
}
ReportProgress();
}
cleanNetwork(n); // Clean up training data
return converged;
}
private void update_velocity(Particle p) {
Random r = NNRandom.getRandom();
double rand_c0 = params.c0 * r.nextDouble();
double rand_c1 = params.c1 * r.nextDouble();
double rand_c2 = params.c2 * r.nextDouble();
// each dimension of array needs to be calculated separately
for(int i=0; i < p.size; i++) {
// p.velocity[x] = c0*p.velocity[x] + c1*rnd*(p.best[x] - p.position[x]) + c2*rnd*(g.best[x] - p.position[x])
double pbest_delta = rand_c1 * (p.best[i] - p.position[i]);
double gbest_delta = rand_c2 * (swarm[g].best[i] - p.position[i]);
p.setVelocity(i, rand_c0*p.velocity[i] + pbest_delta + gbest_delta);
}
}
private void update_position(Particle p) {
for(int i=0; i < p.size; i++) {
p.setPosition(i, p.position[i] + p.velocity[i]);
}
}
/**
* Defines particle to be used for Particle Swarm Optimization.
*/
private class Particle {
public Double[] position;
public Double[] velocity;
public Double[] best;
public Object best_fitness; // fitness of pbest
protected double pmin; // minimum weight value
protected double pmax; // maximum weight value
protected double vmax; // maximum velocity change, from pmin/max
protected int size;
// create new particle and fill with data from p
Particle(Particle p) throws IncompatibleDataException {
reset(p.size);
copy(p);
}
Particle(int size, double pmin, double pmax) {
reset(size);
this.pmin = pmin;
this.pmax = pmax;
this.vmax = this.pmax - this.pmin;
randomize(position, this.pmin, this.pmax);
randomize(velocity, -this.vmax, this.vmax);
setBest();
}
private void reset(int size) {
//create initial arrays
this.size = size;
position = new Double[size];
velocity = new Double[size];
best = new Double[size];
}
// debug only
protected void printPosition() {
if(DEBUG) {
for(int i=0; i < this.size; i++) {
System.out.print(position[i] + " ");
}
System.out.print("\n");
System.out.flush();
}
}
// debug only
protected void printBest() {
if(DEBUG) {
for(int i=0; i < this.size; i++) {
System.out.print(best[i] + " ");
}
System.out.print("\n");
System.out.flush();
}
}
// copies particle elements
private void copy(Particle p) throws IncompatibleDataException {
if(p.size != this.size) {
throw new IncompatibleDataException(
"Particles are of different size. Unable to copy.");
}
for(int i=0; i < this.size; i++) {
this.position[i] = p.position[i];
this.velocity[i] = p.velocity[i];
this.best[i] = p.best[i];
}
}
// set new best array
public final void setBest() {
System.arraycopy(this.position, 0, this.best, 0, this.size);
}
// randomize elements of array, bounding each new random entry
private void randomize(Double[] l, double min, double max) {
Random r = NNRandom.getRandom();
double range = (max - min);
for(int i=0; i < this.size; i++) {
//set to random value within appropriate range
l[i] = r.nextDouble() * range + min;
}
}
// set velocity array value, making sure it falls within bounds
public void setVelocity(int index, double value) {
if(value > this.vmax) {
value = this.vmax;
} else if(value < -this.vmax) {
value = -this.vmax;
}
this.velocity[index] = value;
}
// set position array value, making sure it falls within bounds
public void setPosition(int index, double value) {
if(value > this.pmax) {
value = this.pmax;
} else if(value < this.pmin) {
value = this.pmin;
}
this.position[index] = value;
}
}
}
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithm.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import java.util.ArrayList;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.event.TrainingProgressEvent;
import edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface;
/**
* Abstract training algorithm that provides several helper functions for
* training algorithms.
*/
public abstract class TrainingAlgorithm implements TrainingAlgorithmInterface {
/**
* Can be enabled before compilation to print extra debug statements.
* (If set to false, debug statements are compiled out because the flag is final.)
*/
public static final boolean DEBUG=false;
/**
* Provides functionality to pass a list of parameters specific to a training algorithm.
* Individual training algorithms override this class to provide their own parameters.
* Passing parameters in this way reduces the number of places in the code that must be
* updated when adding new algorithms.
*/
public abstract static class Params {
};
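// Illustrative sketch (an assumption, not part of the tool's source): a
// concrete training algorithm would declare its own Params subclass such as
// the following, mirroring the fields the particle swarm trainer above reads
// from its params object (swarmSize, c0, c1, c2, fitnessFxn). The class name
// here is hypothetical.
public static class ExampleSwarmParams extends Params {
public int swarmSize;   // number of particles in the swarm
public double c0;       // inertia coefficient
public double c1;       // cognitive (pbest) coefficient
public double c2;       // social (gbest) coefficient
public edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface fitnessFxn; // pluggable fitness function
}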
/**
* Correctness criteria
*/
protected double[] acceptableError;
/**
* Number of training iterations to attempt
*/
protected int maxIterations;
/**
* Interval in iterations between TrainingProgressEvents
*/
protected int reportInterval;
/**
* Current number of iterations attempted
*/
protected int iterations;
protected ArrayList<TrainingProgressEventListenerInterface> listeners;
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
/**
*
* @return correctness criteria
*/
public double[] getAcceptableError() {
return this.acceptableError;
}
/**
*
* @return current training iterations
*/
public int getIterations() {
return this.iterations;
}
/**
*
* @return total allowed training iterations
*/
public int getMaxIterations() {
return this.maxIterations;
}
/**
*
* @return interval (in iterations) between TrainingProgressEvents
*/
public int getReportInterval() {
return this.reportInterval;
}
/**
* @param acceptableError correctness criteria
*/
public void setAcceptableError(double[] acceptableError) {
this.acceptableError = acceptableError;
}
/**
* @param maxIterations total allowed training iterations
*/
public void setMaxIterations(int maxIterations) {
this.maxIterations = maxIterations;
}
/**
* @param reportInterval interval (in iterations) between TrainingProgressEvents
*/
public void setReportInterval(int reportInterval) {
this.reportInterval = reportInterval;
}
// </editor-fold>
/**
* Constructor. Sets variables required for all training algorithms.
*
* @param maxIterations Number of training iterations to attempt before stopping
* @param acceptableError Correctness criteria for each dimension of output
*/
public TrainingAlgorithm(int maxIterations, double[] acceptableError) {
this.maxIterations = maxIterations;
this.acceptableError = acceptableError;
listeners = new ArrayList<TrainingProgressEventListenerInterface>();
iterations = 0;
reportInterval = 1000;
}
/**
* Does basic validation of network and dataset to ensure compatibility.
*
* @param n Network to verify
* @param d Dataset to verify
* @throws IncompatibleDataException If dataset dimensions don't match network dimensions
* @throws NeuralNetworkException If dataset is empty
*/
protected static void VerifyInputs(NeuralNetwork n, DataSet d)
throws IncompatibleDataException, NeuralNetworkException {
// Ensure data set has same # of inputs and outputs as the network
if (n.getNumInputs() != d.getNumInputs()
|| n.getLayer(n.getLayerCount() - 1).getNodeCount() != d.getNumOutputs()) {
throw new IncompatibleDataException(
"Dataset has different number of inputs and outputs than the network.\n" +
"Dataset: " + d.getNumInputs() + ", " + d.getNumOutputs() + "\n" +
"Network: " + n.getNumInputs() + ", " +
n.getLayer(n.getLayerCount() - 1).getNodeCount());
}
if (d.size() == 0) {
throw new NeuralNetworkException("Dataset is empty");
}
}
/**
* Removes training data from nodes of the network.
*
* @param n Neural network to clean
*/
protected static void cleanNetwork(NeuralNetwork n) {
for (NetworkLayer l : n) {
for (NetworkNode nn : l) {
nn.setTrainingData(null);
}
}
}
/**
* Updates training progress listener with current status
* at regular intervals.
*/
protected void ReportProgress() {
if (iterations % reportInterval == 0) {
double percentComplete = ((double)iterations/(double)maxIterations) * 100.0;
TrainingProgressEvent e = new TrainingProgressEvent(this, iterations,
percentComplete);
for (TrainingProgressEventListenerInterface l : listeners) {
l.trainingProgress(e);
}
Thread.yield();
}
}
/**
* Loads an array of weights into the neural network.
*
* @param weights Array of weights (must match network node weight count)
* @param n Network to set weights on
*/
protected static void setWeights(Double[] weights, NeuralNetwork n) {
n.setWeights(weights);
}
// <editor-fold defaultstate="collapsed" desc="Listener Methods">
public void addTrainingEventListener(TrainingProgressEventListenerInterface listener) {
listeners.add(listener);
}
public void removeTrainingEventListener(TrainingProgressEventListenerInterface listener) {
listeners.remove(listener);
}
public void clearListeners() {
listeners.clear();
}
// </editor-fold>
}
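The following sketch (not part of the tool's source) illustrates how a component
might consume the TrainingProgressEvents dispatched by ReportProgress() above.
The getIterations() and getPercentComplete() accessors are assumptions inferred
from the values passed to the event's constructor.

import edu.csus.ecs.ssnn.event.TrainingProgressEvent;
import edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface;
// Hypothetical listener sketch: prints training progress to the console.
public class ConsoleProgressListener implements TrainingProgressEventListenerInterface {
public void trainingProgress(TrainingProgressEvent e) {
// accessor names below are assumptions about TrainingProgressEvent's API
System.out.println("Iterations: " + e.getIterations()
+ " (" + e.getPercentComplete() + "% of maximum)");
}
}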
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface;
/**
* Interface for all training algorithms.
*/
public interface TrainingAlgorithmInterface {
/**
* Trains the network.
*
* @param n Neural network to train.
* @param d Training set
* @return True if network converged, false otherwise.
* @throws NeuralNetworkException
*/
public boolean train(NeuralNetwork n, DataSet d) throws NeuralNetworkException;
/**
*
* @return Number of iterations taken.
*/
public int getIterations();
/**
* @return Maximum number of iterations allowed.
*/
public int getMaxIterations();
/**
*
* @return Frequency (in iterations) to send training events.
*/
public int getReportInterval();
/**
* @param maxIterations Maximum number of iterations allowed.
*/
public void setMaxIterations(int maxIterations);
/**
* @param reportInterval Frequency (in iterations) to send training events.
*/
public void setReportInterval(int reportInterval);
public void addTrainingEventListener(TrainingProgressEventListenerInterface listener);
public void removeTrainingEventListener(TrainingProgressEventListenerInterface listener);
public void clearListeners();
}
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
/**
* Interface for fitness functions.
*/
public interface FitnessInterface {
/**
* Calculates and returns fitness of network against training set.
* Returns as an Object to allow different fitness functions to return
* different data types.
*
* @param weights Weights to set on neural network.
* @param n Network to use for calculating fitness.
* @param trainingData Training set to calculate fitness against.
* @param acceptableError Acceptable error for outputs to differ from expected.
* @return Object containing fitness
* @throws NeuralNetworkException
*/
public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
double[] acceptableError) throws NeuralNetworkException;
/**
* Checks passed fitness value for convergence.
*
* @param fitness Calculated fitness value
* @return True if the fitness value indicates convergence, false otherwise.
*/
public boolean hasConverged(Object fitness);
/**
* Compares two fitness values to determine if the new one is better than the original.
*
* @param orig_fitness Original calculated fitness value
* @param new_fitness New calculated fitness value
* @return True if new_fitness is better than orig_fitness, false otherwise.
*/
public boolean compare(Object new_fitness, Object orig_fitness);
}
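To illustrate the Object-typed fitness contract defined above, the following
hypothetical fitness function measures the sum of squared output errors. It is
a sketch only, not part of the tool's source: the convergence threshold is an
assumed constructor parameter, the acceptableError argument is intentionally
unused, and the class is assumed to sit in the same
edu.csus.ecs.ssnn.nn.trainer.fitness package as FitnessInterface.

import java.util.List;
import edu.csus.ecs.ssnn.nn.DataPair;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
// Hypothetical fitness function: sum of squared output errors (a Double).
// Lower is better; convergence is declared below an assumed threshold.
public class SquaredErrorFitness implements FitnessInterface {
private final double threshold; // assumed convergence threshold
public SquaredErrorFitness(double threshold) {
this.threshold = threshold;
}
public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
double[] acceptableError) throws NeuralNetworkException {
double sse = 0.0;
n.setWeights(weights);
for(DataPair dp: trainingData) {
List<Double> outs = n.getOutputs(dp.getInputs());
for(int i = 0; i < outs.size(); i++) {
double err = dp.getOutput(i) - outs.get(i);
sse += err * err; // accumulate squared error per output dimension
}
}
return Double.valueOf(sse);
}
public boolean hasConverged(Object fitness) {
return (Double) fitness <= threshold;
}
public boolean compare(Object new_fitness, Object orig_fitness) {
// new fitness is better when the accumulated error is smaller
return (Double) new_fitness < (Double) orig_fitness;
}
}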
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.fitness.LargestChunk.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;
import java.util.List;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
import edu.csus.ecs.ssnn.nn.DataPair;
/**
* Fitness is determined by largest chunk size of contiguously solved points.
* Larger fitness is better.
*/
public class LargestChunk implements FitnessInterface {
int totalSize;
/**
* Constructor.
*/
public LargestChunk() {
this.totalSize = 0;
}
/**
* Fitness is determined by largest chunk size of contiguously solved points.
*
* @param weights Weights to set on neural network.
* @param n Network to use for calculating fitness.
* @param trainingData Training set to calculate fitness against.
* @param acceptableError Acceptable error for outputs to differ from expected.
* @return Object containing fitness
* @throws NeuralNetworkException
*/
public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
double[] acceptableError) throws NeuralNetworkException {
boolean correct;
int maxChunkSize = 0;
int curChunkSize = 0;
n.setWeights(weights);
this.totalSize = trainingData.size();
double[] errors = new double[n.getNumOutputs()];
for(DataPair dp: trainingData) {
// run training data
List<Double> inputs = dp.getInputs();
List<Double> outs = n.getOutputs(inputs);
// Compute output errors on each dimension
for (int i = 0; i < n.getNumOutputs(); i++) {
// Error = desired output - actual output
errors[i] = dp.getOutput(i) - outs.get(i);
}
// See if all outputs are within acceptable error
correct = true;
for (int i = 0; i < errors.length; i++) {
if (Math.abs(errors[i]) > acceptableError[i]) {
correct = false;
break;
}
}
// set chunks
if(correct) {
curChunkSize++;
}
else {
// record the chunk if it was the largest so far, then always reset it
if( curChunkSize > maxChunkSize ) {
maxChunkSize = curChunkSize;
}
curChunkSize = 0;
}
}
// final check in case largest chunk is at the end
if( curChunkSize > maxChunkSize ) {
maxChunkSize = curChunkSize;
}
return new Integer(maxChunkSize);
}
public boolean hasConverged(Object fitness) {
// if chunk covers entire data set
return((Integer) fitness == this.totalSize);
}
public boolean compare(Object new_fitness, Object orig_fitness) {
// new chunk is larger than old chunk
return ( (Integer) new_fitness > (Integer) orig_fitness);
}
}
/* ===================================================================
edu.csus.ecs.ssnn.nn.trainer.fitness.SimpleUnsolvedPoints.java
=================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;
import java.util.List;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
import edu.csus.ecs.ssnn.nn.DataPair;
/**
* Calculates fitness by number of unsolved points. Lower fitness is better.
*/
public class SimpleUnsolvedPoints implements FitnessInterface {
/**
* Calculates fitness by number of unsolved points.
*
* @param weights Weights to set on neural network.
* @param n Network to use for calculating fitness.
* @param trainingData Training set to calculate fitness against.
* @param acceptableError Acceptable error for outputs to differ from expected.
* @return Object containing fitness
* @throws NeuralNetworkException
*/
public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
double[] acceptableError) throws NeuralNetworkException {
int incorrectOutput = 0;
n.setWeights(weights);
for(DataPair dp: trainingData) {
// run training data
List<Double> inputs = dp.getInputs();
List<Double> outs = n.getOutputs(inputs);
// Compute output errors on each dimension
double[] errors = new double[n.getNumOutputs()];
for (int i = 0; i < n.getNumOutputs(); i++) {
// Error = desired output - actual output
errors[i] = dp.getOutput(i) - outs.get(i);
}
// See if all outputs are within acceptable error
for (int i = 0; i < errors.length; i++) {
if (Math.abs(errors[i]) > acceptableError[i]) {
incorrectOutput++;
// using # of miscalculations
break;
}
}
}
return new Integer(incorrectOutput);
}
public boolean hasConverged(Object fitness) {
return((Integer) fitness == 0);
}
public boolean compare(Object new_fitness, Object orig_fitness) {
return ( (Integer) new_fitness < (Integer) orig_fitness);
}
}
/* ===================================================================
edu.csus.ecs.ssnn.splittrainer.AreaBasedBinarySplitTrainer.java
=================================================================== */
package edu.csus.ecs.ssnn.splittrainer;
import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;
public class AreaBasedBinarySplitTrainer extends SSNNTrainingAlgorithm {
/*
* Used to pass region / randomize values when queuing networks
*/
protected class QueueDataRegion {
public DataRegion region;
public boolean saveWeights;
public QueueDataRegion(DataRegion region, boolean saveWeights) {
this.region = region;
this.saveWeights = saveWeights;
}
}
public AreaBasedBinarySplitTrainer() {
super();
}
/* Area Based Binary Splitting algorithm
* Find largest solved chunk and split network as follows:
* [smaller unsolved edge + solved] [larger unsolved]
* Currently, "smaller" is defined as less area rather than fewer points
* contained within the region. If a sufficiently large chunk cannot be found,
* a centroid split is performed.
*
* The network that includes the solved chunk must be initialized with the
* last trained weights.
*/
@Override
protected SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d) throws
NeuralNetworkException {
Chunk best = findBestChunk(n, d);
// Now determine what kind of split to do...
if (best.length >= minChunkSize) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(best.dimension);
double domainMin = sortedData.getPair(0).getInput(best.dimension);
double domainMax = sortedData.getPair(sortedData.size() - 1).getInput(best.dimension);
// Somehow our chunk covers the entire domain
/*
* It's actually possible for this to happen, although it is
* extremely rare. If the training algorithm does not converge, but
* its very last adjustment to the network made it perfect, this
* case will occur.
*/
if (best.start == domainMin && best.end == domainMax) {
return SplitType.unnecessary;
}
// Are we on the lower or upper edge?
else if (best.start == domainMin || best.end == domainMax) {
// Make sure there are at least two distinct values in the chunk
if (countDistinctValuesInChunk(best.start, best.end, sortedData,
best.dimension) < 2) {
// Can't split (too few values) - try centroid
return centroidSplit(n, r, d);
} else {
// split based on min/max
List<DataRegion> splitRegions;
ArrayList<Double> splitPoints = new ArrayList<Double>();
ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
// lower half is solved
if(best.start == domainMin) {
splitRegions = r.split(best.dimension, best.end);
unsolvedRegions.add(splitRegions.get(1));
solvedRegions.add(splitRegions.get(0));
splitPoints.add(best.end);
//upper half is solved
} else {
splitRegions = r.split(best.dimension, best.start);
unsolvedRegions.add(splitRegions.get(0));
solvedRegions.add(splitRegions.get(1));
splitPoints.add(best.start);
}
// Remove existing network
trainingMNetwork.removeNetwork(n);
// Re-add solved region
queueSolvedNetwork(n, solvedRegions);
// create new network for the unsolved region
queueUnsolvedNetworks(n, unsolvedRegions);
// Create and dispatch split event
HandleSplit(SplitType.chunk,
unsolvedRegions, solvedRegions, splitPoints,
best.length, best.dimension);
return SplitType.chunk;
}
}
// We're in the middle of the data set
else {
// Need at least three distinct values in the chunk
if (countDistinctValuesInChunk(best.start, best.end, sortedData,
best.dimension) < 3) {
// Can't split - try centroid
return centroidSplit(n, r, d);
} else {
// split data into [smaller unsolved + solved] [unsolved]
// first find [unsolved] [solved] [unsolved]
// then determine which [unsolved] is smaller
List<DataRegion> firstSplit = r.split(best.dimension, best.start);
List<DataRegion> secondSplit =
firstSplit.get(1).split(best.dimension, best.end);
ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
List<DataRegion> finalSplit;
// lower unsolved is smaller, so include with solved network
if(firstSplit.get(0).getArea() < secondSplit.get(1).getArea()) {
best.start = r.getMin(best.dimension);
finalSplit = r.split(best.dimension, best.end);
unsolvedRegions.add(finalSplit.get(1));
unsolvedRegions.add(finalSplit.get(0)); // save weights
} else {
// upper unsolved is smaller, so include with solved network
best.end = r.getMax(best.dimension);
finalSplit = r.split(best.dimension, best.start);
// get(1) contains the solved chunk here, so it is queued second to
// receive the saved weights below
unsolvedRegions.add(finalSplit.get(0));
unsolvedRegions.add(finalSplit.get(1)); // save weights
}
ArrayList<Double> splitPoints = new ArrayList<Double>();
splitPoints.add(best.start);
splitPoints.add(best.end);
// Remove existing network
trainingMNetwork.removeNetwork(n);
// create new networks for the unsolved upper / lower regions
List<QueueDataRegion> queueUnsolved = new ArrayList<QueueDataRegion>();
queueUnsolved.add(new QueueDataRegion(unsolvedRegions.get(0), false));
queueUnsolved.add(new QueueDataRegion(unsolvedRegions.get(1), true));
queueUnsolvedNetworks_SaveWeights(n, queueUnsolved);
// Create and dispatch split event
HandleSplit(SplitType.chunk,
unsolvedRegions, solvedRegions, splitPoints,
best.length, best.dimension);
return SplitType.chunk;
}
}
// No chunk or chunk too small
} else {
sizeFailures++;
return centroidSplit(n, r, d);
}
}
/*
* Splits Dataset across 2 neural networks with no solved regions.
*/
private SplitType centroidSplit(NeuralNetwork n, DataRegion r, DataSet d) throws
NeuralNetworkException {
// To select the dimension to split along, find the one with the most
// distinct values
int bestDimension = 0;
int mostDistinctValues = 0;
for (int inputNum = 0; inputNum < n.getNumInputs(); inputNum++) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(inputNum);
int currDistinctValues = countDistinctValuesInChunk(
sortedData.getPair(0).getInput(inputNum),
sortedData.getPair(sortedData.size() - 1).getInput(inputNum),
sortedData,
inputNum);
if (currDistinctValues > mostDistinctValues) {
mostDistinctValues = currDistinctValues;
bestDimension = inputNum;
}
}
if (mostDistinctValues > 2) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(bestDimension);
int splitPoint = mostDistinctValues / 2;
// creating solved regions list (empty), unsolved regions, and split points list
ArrayList<Double> splitPoints = new ArrayList<Double>();
splitPoints.add(sortedData.getPair(splitPoint).getInput(bestDimension));
List<DataRegion> unsolvedRegions = r.split(bestDimension, sortedData
.getPair(splitPoint).getInput(bestDimension));
List<DataRegion> solvedRegions = new ArrayList<DataRegion>();
// removing original network - replaced by new split networks
trainingMNetwork.removeNetwork(n);
// create new networks for the unsolved upper / lower regions
queueUnsolvedNetworks(n, unsolvedRegions);
HandleSplit(SplitType.centroid,
unsolvedRegions, solvedRegions, splitPoints,
0, bestDimension);
return SplitType.centroid;
}
// No way to split
return SplitType.impossible;
}
/*
* Providing new queuing method to allow for non-randomized weights
* Rather than pass a list of regions, passing a list of <region, doRandomize> pairs
*/
protected void queueUnsolvedNetworks_SaveWeights(NeuralNetwork nnTopo,
List<QueueDataRegion> regions) throws NeuralNetworkException{
for(QueueDataRegion q: regions) {
DataRegion scalingRegion;
NeuralNetwork n = new NeuralNetwork(nnTopo.getNumInputs(),
nnTopo.getNumOutputs(), nnTopo.getHiddenTopology());
if(!q.saveWeights) {
n.randomizeWeights(0.1);
scalingRegion = q.region;
} else {
// clones weights for new network - this is important if future algorithms
// can use the same nnTopo to create multiple networks, as cloning ensures
// they aren't sharing the same underlying Double objects
n.setWeights(nnTopo.getWeights().toArray(new Double[]{}));
scalingRegion = nnTopo.getScalingRegion();
}
n.setInputScaling(scalingRegion, trainingMNetwork.getInputRange());
trainingMNetwork.addNetwork(n, q.region);
networkQueue.add(trainingMNetwork.getNetworkRecord(trainingMNetwork.getNetworkCount() - 1));
}
}
/*
* Count all distinct values in the dimension that fall between min / max
*/
private int countDistinctValuesInChunk(double minValue, double maxValue,
DataSet sortedData, int chunkDimension) {
int count = 0;
double lastValue = Double.MIN_VALUE;
for (DataPair p : sortedData) {
double curValue = p.getInput(chunkDimension);
if (curValue > maxValue) {
break;
}
if (curValue >= minValue) {
if (curValue != lastValue) {
lastValue = curValue;
count++;
}
}
}
return count;
}
/*
* Finds largest solved chunk across all domains
*/
private Chunk findBestChunk(NeuralNetwork n, DataSet d) throws NeuralNetworkException
{
Chunk current;
Chunk best = new Chunk();
// determine scaling slopes for original network
// this will match sorted dataset, as domain min/maxes aren't changed
//List<Double> inputSlopes = d.getInputScalingSlope(
//    trainingMNetwork.getScalingInputMin(), trainingMNetwork.getScalingInputMax());
//List<Double> outputSlopes = d.getOutputScalingSlope(
//    trainingMNetwork.getScalingOutputMin(), trainingMNetwork.getScalingOutputMax());
// For each input dimension, find largest solved chunk
for (int inputNum = 0; inputNum < d.getNumInputs(); inputNum++) {
// Create a dataset which is sorted on the input dimension
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(inputNum);
current = new Chunk(); // reset chunk for new dimension
double lastFail = 0;
for (int pairNum = 0; pairNum < sortedData.size(); pairNum++) {
DataPair currentPair = sortedData.getPair(pairNum);
// fast-forward through points with same dimensional value if original failed
// (don't want to split on a dimensional point with a mix of good/bad values)
if(pairNum > 0 && current.length == 0
&& currentPair.getInput(inputNum) == lastFail) {
continue;
}
List<Double> outputs = n.getOutputs(currentPair.getInputs());
// Check each output for correctness against expected
for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
// output is incorrect, reset chunk
if (Math.abs(outputs.get(outputNum) - currentPair.getOutput(outputNum)) > acceptableErrors[outputNum]) {
// found a new best!
if(current.length > best.length) {
best = current;
}
current = new Chunk(); // reset chunk
}
// output is correct, increment chunk
else {
current.length++;
current.end = currentPair.getInput(inputNum);
if(current.length == 1) {
// this is a new chunk
current.start = current.end;
current.dimension = inputNum;
}
}
}
// extra check to make sure last chunk wasn't best
if(current.length > best.length) {
best = current;
}
}
}
// best should now hold best chunk values
return best;
}
}
/* ===================================================================
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithm.java
=================================================================== */
package edu.csus.ecs.ssnn.splittrainer;
import edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface;
import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.EnumMap;
import edu.csus.ecs.ssnn.data.TrainingResults;
import edu.csus.ecs.ssnn.event.NetworkConvergedEvent;
import edu.csus.ecs.ssnn.event.NetworkSplitEvent;
import edu.csus.ecs.ssnn.event.NetworkTrainingEvent;
import edu.csus.ecs.ssnn.event.TrainingCompletedEvent;
import edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface;
import edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.NetworkRecord;
/**
* Base class to use for SSNN training algorithms. It provides several helper
* functions to reduce the amount of effort needed to create a new algorithm.
*/
public abstract class SSNNTrainingAlgorithm implements SSNNTrainingAlgorithmInterface {
/**
* Type of network split that occurred.
*/
protected static enum SplitType {
chunk,
centroid,
impossible,
unnecessary
}
/**
* A chunk is a set of contiguous solved points. This class holds information
* on where the chunk falls in the dataset.
*/
protected class Chunk {
double start;
double end;
int dimension;
int length;
Chunk() {
this.start = Double.MIN_VALUE;
this.end = Double.MIN_VALUE;
this.dimension = -1;
this.length = 0;
}
}
/**
* Contains a list of counts for each split type.
*/
protected Map<SplitType, Integer> splitCounts;
/**
* Number of splits that failed because resulting training set was too small.
*/
protected int sizeFailures;
/**
* Start time of training. Used for calculating how long network training takes.
*/
protected long startTime;
/**
* Total area to be solved by the modular neural network.
*/
protected double totalArea;
/**
* Current number of iterations taken to solve the network.
*/
protected int totalIterations;
/**
* The modular neural network containing solved networks and associated regions.
*/
protected ModularNeuralNetwork trainingMNetwork;
/**
* List of networks that still need to be trained.
*/
protected LinkedList<NetworkRecord> networkQueue;
/**
* Event listeners for splitting.
*/
protected ArrayList<SplitTrainerEventListenerInterface> listeners;
// <editor-fold defaultstate="collapsed" desc="Settable properties for training.">
/**
* Algorithm used to train individual neural networks.
*/
protected TrainingAlgorithmInterface nn_alg;
/**
* Correctness criteria for determining if output was within acceptable error.
*/
protected double[] acceptableErrors;
/**
* Flag used to stop training if a network is unsolvable and cannot be split.
*/
protected boolean failOnUnsplittableNetwork;
/**
* Minimum size of training data allowed for any individual network.
*/
protected int minChunkSize;
/**
* Training set used to train the neural networks.
*/
protected DataSet trainingData;
// </editor-fold>
// <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
/**
* @return Correctness criteria for determining if output was within acceptable error.
*/
public double[] getAcceptableErrors() {
return this.acceptableErrors;
}
/**
* @return Minimum size of training data allowed for any individual network.
*/
public int getMinChunkSize() {
return minChunkSize;
}
/**
*
* @return Flag used to stop training if a network is unsolvable and cannot be split.
*/
public boolean isFailOnUnsplittableNetwork() {
return failOnUnsplittableNetwork;
}
/**
* @param acceptableErrors Correctness criteria for determining if output was
* within acceptable error.
*/
public void setAcceptableErrors(double[] acceptableErrors) {
this.acceptableErrors = acceptableErrors;
}
/**
* @param minSize Minimum size of training data allowed for any individual network.
* @throws DataFormatException Thrown if minimum splitting size is less than one.
*/
public void setMinChunkSize(int minSize) throws DataFormatException {
if(minSize < 1) {
throw new DataFormatException("Minimum splitting must be > 0 (was " + minSize
+ ")");
}
minChunkSize = minSize;
}
/**
* @param fail Flag used to stop training if a network is unsolvable and cannot be split.
*/
*/
public void setFailOnUnsplittableNetwork(boolean fail) {
failOnUnsplittableNetwork = fail;
}
// </editor-fold>
/**
* Constructor.
*/
public SSNNTrainingAlgorithm() {
minChunkSize = 1;
failOnUnsplittableNetwork = true;
splitCounts = new EnumMap<SplitType, Integer>(SplitType.class);
listeners = new ArrayList<SplitTrainerEventListenerInterface>();
}
/**
* Defines split algorithm to be used. Must be defined by the algorithm class.
*
* @param n Network to split.
* @param r Data region of network.
* @param d Data set of region.
* @return Type of split determined by network.
* @throws NeuralNetworkException
*/
protected abstract SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d)
throws NeuralNetworkException;
/**
* Basic training framework for splitting and training subnetworks.
* Individual splitting algorithms need to define splitNetwork().
*
* @param mn Modular neural network to be trained.
* @param trainingDataSet Complete set of training data used to train the MNN.
* @param a Training algorithm used to train individual neural networks.
* @return True if the modular network solved the training set, false otherwise.
* @throws NeuralNetworkException
*/
public boolean train(ModularNeuralNetwork mn, DataSet trainingDataSet,
TrainingAlgorithmInterface a) throws NeuralNetworkException {
boolean nnConverged;
// reset all variables before training
this.trainingData = trainingDataSet;
this.nn_alg = a;
this.trainingMNetwork = mn;
this.totalIterations = 0;
this.startTime = System.currentTimeMillis();
this.totalArea = trainingData.getInputDataRegion().getArea();
this.networkQueue = new LinkedList<NetworkRecord>();
// set all split counts to zero
for(SplitType i : SplitType.values()) {
splitCounts.put(i, 0);
}
// Put all existing neural networks into a queue
Iterator<NetworkRecord> iter = trainingMNetwork.iterator();
while (iter.hasNext()) {
networkQueue.add(iter.next());
}
// While there are networks to train
while (networkQueue.size() > 0) {
// Get the network from the end of the queue and its relevant
// region and training data set. Grabbing last network so that
// previous weights can be applied (if necessary)
NetworkRecord currentNetworkRecord = networkQueue.pop(); //.poll();
NeuralNetwork currentNetwork = currentNetworkRecord.getNeuralNetwork();
DataRegion currentRegion = currentNetworkRecord.getDataRegion();
DataSet regionSpecificDataSet = trainingData.getDataInRegion(currentRegion);
currentNetworkRecord.setTrainingSetSize(regionSpecificDataSet.size());
// Try training the network
StartTraining(currentRegion);
// training with data set
nnConverged = TrainNetwork(currentNetworkRecord, regionSpecificDataSet);
if (!nnConverged) {
// If it doesn't converge, try splitting it.
SplitType splitResult = splitNetwork(currentNetwork, currentRegion,
regionSpecificDataSet);
if (splitResult == SplitType.centroid) {
// no further action needed; centroidSplit() already queued the new subnetworks
} else if (splitResult == SplitType.chunk) {
// no further action needed; the chunk split already queued the new subnetworks
} else if (splitResult == SplitType.unnecessary) {
currentNetworkRecord.setSolved(true);
HandleNetworkConverged(currentRegion, nn_alg.getIterations());
} else if (failOnUnsplittableNetwork) {
removeUntestedNetworks();
HandleTrainingCompleted(false);
return false;
}
} else {
// done with network, converged successfully
HandleNetworkConverged(currentRegion, nn_alg.getIterations());
}
Thread.yield();
}
HandleTrainingCompleted(true);
return true;
}
/**
* Walks through queued networks to determine area of data regions
* left to solve, then computes against totalArea.
*
* @return percentage of total area solved by the networks.
*/
protected final double computeSolvedPercentage() {
double unsolvedArea = 0;
for (NetworkRecord r : networkQueue) {
unsolvedArea += r.getDataRegion().getArea();
}
return (totalArea - unsolvedArea) * 100.0/totalArea;
}
/**
* Removes all networks from the MNN that were never tested.
* (This occurs when the iteration limit is reached before the problem is solved.)
* A network is considered untested if its solvedDataPoints set was never populated.
*
* @throws NeuralNetworkException Thrown if unable to add a network to the list
* of untested networks.
*/
protected void removeUntestedNetworks() throws NeuralNetworkException {
ModularNeuralNetwork untested = new ModularNeuralNetwork(
trainingMNetwork.getNumInputs(), trainingMNetwork.getNumOutputs(),
trainingMNetwork.getInputRange());
for(NetworkRecord nr: trainingMNetwork) {
if(nr.getSolvedDataPoints() == null) {
untested.addNetwork(nr.getNeuralNetwork(), nr.getDataRegion());
}
}
for(NetworkRecord nr: untested) {
trainingMNetwork.removeNetwork(nr.getNeuralNetwork());
}
}
/**
* @param n Solved neural network to add to MNN.
* @param regions List of regions the solved network can solve.
* @throws NeuralNetworkException
*/
protected void queueSolvedNetwork(NeuralNetwork n, List<DataRegion> regions) throws
NeuralNetworkException{
for(DataRegion r: regions) {
trainingMNetwork.addNetwork(n, r, true);
NetworkRecord nr =
trainingMNetwork.getNetworkRecord(trainingMNetwork.getNetworkCount()-1);
nr.setTrainingSetSize(trainingData.getDataInRegion(r).size());
}
}
/**
* @param parentNN Used to gather neural network topology (inputs, outputs,
* hidden layers) to create new unsolved networks.
* @param regions List of new unsolved regions.
* @throws NeuralNetworkException Thrown on errors setting network input scaling
* or adding the network to the queue.
*/
protected void queueUnsolvedNetworks(NeuralNetwork parentNN, List<DataRegion>
regions) throws NeuralNetworkException{
for(DataRegion r: regions) {
NeuralNetwork n = new NeuralNetwork(parentNN.getNumInputs(),
parentNN.getNumOutputs(), parentNN.getHiddenTopology());
n.setInputScaling(r, trainingMNetwork.getInputRange());
n.randomizeWeights(0.1); // used by Back Propagation only
trainingMNetwork.addNetwork(n, r);
networkQueue.add(trainingMNetwork.getNetworkRecord(trainingMNetwork.getNetworkCount() - 1));
}
}
/**
* Runs through the training set and determines which points are solvable by
* the individual network. At a minimum, all points within the initial
* solved region should be correctly solved.
*
* @param nr Network record containing the network and its associated data region.
* @throws NeuralNetworkException Thrown if an error occurs creating data sets or
* calculating network outputs, or if a point in the network's region
* (i.e. previously solved) is now calculated as unsolved.
*/
protected void setSolvedDataPoints(NetworkRecord nr) throws NeuralNetworkException {
DataSet solved = new DataSet(trainingData.getNumInputs(),
trainingData.getNumOutputs());
DataSet unsolved = new DataSet(trainingData.getNumInputs(),
trainingData.getNumOutputs());
for(DataPair pair: trainingData) {
List<Double> outputs = nr.getNeuralNetwork().getOutputs(pair.getInputs());
// Check each output for correctness
boolean correct = true;
for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
if (Math.abs(outputs.get(outputNum) - pair.getOutput(outputNum)) >
acceptableErrors[outputNum]) {
// DEBUG check - yell if unsolved within region
if(nr.getDataRegion().containsPoint(pair.getInputs())) {
throw new NeuralNetworkException("Network determined as correct, " +
"but point in region remains unsolved.");
}
unsolved.addPair(pair);
correct = false;
break;
}
}
// add the pair to the solved set only once every output checks out
if(correct) {
solved.addPair(pair);
}
}
if(solved.size() == 0 && nr.isSolved()) {
throw new NeuralNetworkException("Region determined as solved, but no points " +
"in solved set.");
}
nr.setSolvedDataPoints(solved);
nr.setUnsolvedDataPoints(unsolved);
}
/**
* Train an individual neural network.
*
* @param currentNetworkRecord Network record that includes the network to be trained
* @param regionTrainingSet Data points from the training set that fall in the region
* @return True if network converges, false otherwise.
* @throws NeuralNetworkException Thrown if an error occurs while training the network
*/
protected boolean TrainNetwork(NetworkRecord currentNetworkRecord, DataSet
regionTrainingSet) throws NeuralNetworkException {
NeuralNetwork currentNetwork = currentNetworkRecord.getNeuralNetwork();
boolean converged = nn_alg.train(currentNetwork, regionTrainingSet);
currentNetworkRecord.setSolved(converged);
totalIterations += nn_alg.getIterations();
return converged;
}
// <editor-fold defaultstate="collapsed" desc="Event Dispatchers">
/**
* Called before starting to train a new neural network.
*
* @param region Region to be solved by the network
*/
protected final void StartTraining(DataRegion region) {
NetworkTrainingEvent e = new NetworkTrainingEvent(this, region);
dispatchNetworkTrainingEvent(e);
}
/**
* Called by the algorithm after a split occurs.
* Currently, the function updates split counters and sends an event to
* the GUI.
*
* @param split Split type that occurred.
* @param unsolvedRegions List of unsolved regions generated from the split.
* @param solvedRegions List of solved regions generated by the split.
* @param splitPoints Points along which the region was split.
* @param bestChunkSize Size of largest chunk generated from the split.
* @param bestSplitDimension Dimension on which splits occurred.
*/
protected final void HandleSplit(SplitType split
,List<DataRegion> unsolvedRegions
,List<DataRegion> solvedRegions
,List<Double> splitPoints
,int bestChunkSize
,int bestSplitDimension) {
NetworkSplitEvent.SplitType networkSplit;
// increment appropriate split count
splitCounts.put(split, splitCounts.get(split)+1);
switch(split) {
case chunk:
networkSplit = NetworkSplitEvent.SplitType.chunk;
break;
case centroid:
default:
networkSplit = NetworkSplitEvent.SplitType.centroid;
break;
}
NetworkSplitEvent e = new NetworkSplitEvent(this, bestChunkSize,
networkQueue.size()
,trainingMNetwork.getSolvedCount(), bestSplitDimension, networkSplit
,trainingMNetwork.getNetworkCount(), computeSolvedPercentage()
,unsolvedRegions ,solvedRegions ,splitPoints);
dispatchNetworkSplitEvent(e);
}
/**
* Called by the algorithm after the current network has converged.
*
* @param r Data region solved by network.
* @param iterations Iterations taken to solve the network.
*/
protected final void HandleNetworkConverged(DataRegion r, int iterations) {
NetworkConvergedEvent nce = new NetworkConvergedEvent(this
,r                                   // solvedRegion
,networkQueue.size()                 // networksInQueue
,trainingMNetwork.getNetworkCount()  // totalNetworks
,computeSolvedPercentage()           // percentage solved
,iterations);                        // iterations
dispatchNetworkConvergedEvent(nce);
}
/**
* Called by the algorithm after all network training is complete.
*
* @param isSolved Has the modular neural network solved the training set?
*/
protected final void HandleTrainingCompleted(boolean isSolved) {
long endTime = System.currentTimeMillis();
// walk through networks and compute solved values
for(int i=0; i < trainingMNetwork.getNetworkCount(); i++) {
NetworkRecord nr = trainingMNetwork.getNetworkRecord(i);
try {
setSolvedDataPoints(nr);
} catch(NeuralNetworkException ex) {
System.out.println("Error in setSolvedDataPoints: " + ex.getMessage());
}
}
TrainingResults tr = new TrainingResults();
tr.setTrainingIterations(totalIterations);
tr.setChunkSplits(splitCounts.get(SplitType.chunk));
tr.setChunksProduced(trainingMNetwork.getNetworkCount());
tr.setCentroidSplits(splitCounts.get(SplitType.centroid));
tr.setSizeFailures(sizeFailures);
tr.setSolved(isSolved);
tr.setTrainingDuration((endTime - startTime) / 1000);
TrainingCompletedEvent tce = new TrainingCompletedEvent(this, tr);
dispatchTrainingCompletedEvent(tce);
}
// </editor-fold>
// <editor-fold defaultstate="collapsed" desc="Listener Methods">
public void addListener(SplitTrainerEventListenerInterface l) {
listeners.add(l);
}
public void removeListener(SplitTrainerEventListenerInterface l) {
listeners.remove(l);
}
public void clearListeners() {
listeners.clear();
}
/**
*
* @param e
*/
protected void dispatchNetworkSplitEvent(NetworkSplitEvent e) {
for (SplitTrainerEventListenerInterface l : listeners) {
l.networkSplit(e);
}
}
/**
*
* @param e
*/
protected void dispatchNetworkConvergedEvent(NetworkConvergedEvent e) {
for (SplitTrainerEventListenerInterface l : listeners) {
l.networkConverged(e);
}
}
/**
*
* @param e
*/
protected void dispatchTrainingCompletedEvent(TrainingCompletedEvent e) {
for (SplitTrainerEventListenerInterface l : listeners) {
l.trainingCompleted(e);
}
}
/**
*
* @param e
*/
protected void dispatchNetworkTrainingEvent(NetworkTrainingEvent e) {
for (SplitTrainerEventListenerInterface l : listeners) {
l.networkTraining(e);
}
}
// </editor-fold>
}
/* ===================================================================
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithmInterface.java
=================================================================== */
package edu.csus.ecs.ssnn.splittrainer;
import edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface;
import edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface;
import edu.csus.ecs.ssnn.nn.*;
/**
* Interface for SSNN training algorithms. Implementations must provide train()
* and allow other components to attach event listeners.
*/
public interface SSNNTrainingAlgorithmInterface {
/**
* @param n Empty modular neural network. It defines the structure of the
* individual neural networks, as well as maintaining internal
* lists of networks and other information on the state of training.
* @param trainingData The training set.
* @param a The training algorithm used to train individual neural networks.
* @return True if trained successfully, false otherwise.
* @throws NeuralNetworkException Thrown on errors in getting region specific data,
* training an individual network, or splitting the network.
*/
public boolean train(ModularNeuralNetwork n, DataSet trainingData,
TrainingAlgorithmInterface a) throws NeuralNetworkException;
/**
* Adds passed listener to collection. The event is triggered on splits.
* @param l Event listener
*/
public void addListener(SplitTrainerEventListenerInterface l);
/**
* Removes passed listener from collection.
* @param l Event listener
*/
public void removeListener(SplitTrainerEventListenerInterface l);
/**
* Clears all listeners attached to the interface
*/
public void clearListeners();
}
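The following sketch (not part of the tool's source) shows how these interfaces
might be wired together, using the TrainedResultsSplitTrainer from the next
listing to drive a per-network training algorithm. The placement of this class
in the edu.csus.ecs.ssnn.splittrainer package and of DataFormatException under
edu.csus.ecs.ssnn.nn are assumptions.

import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface;
// Hypothetical wiring sketch for a split-training run.
public class SplitTrainingExample {
public static boolean run(ModularNeuralNetwork mn, DataSet data,
TrainingAlgorithmInterface alg)
throws NeuralNetworkException, DataFormatException {
TrainedResultsSplitTrainer splitter = new TrainedResultsSplitTrainer();
splitter.setAcceptableErrors(new double[] { 0.1 }); // one entry per output
splitter.setMinChunkSize(4); // throws DataFormatException if < 1
splitter.setFailOnUnsplittableNetwork(true);
// a SplitTrainerEventListenerInterface could be attached via addListener()
return splitter.train(mn, data, alg);
}
}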
/* ===================================================================
edu.csus.ecs.ssnn.splittrainer.TrainedResultsSplitTrainer.java
=================================================================== */
package edu.csus.ecs.ssnn.splittrainer;
import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;
public class TrainedResultsSplitTrainer extends SSNNTrainingAlgorithm {
public TrainedResultsSplitTrainer() {
super();
}
/* Basic SSNN splitting algorithm
* Find largest solved chunk and split network into either:
* a) [solved] [unsolved]             (if solved is on an edge of the region)
* b) [unsolved] [solved] [unsolved]  (if solved is in the middle of the region)
* If a sufficiently large chunk cannot be found, a centroid split is performed.
*/
@Override
protected SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d) throws
NeuralNetworkException {
Chunk best = findBestChunk(n, d);
// Now determine what kind of split to do...
if (best.length >= minChunkSize) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(best.dimension);
double domainMin = sortedData.getPair(0).getInput(best.dimension);
double domainMax = sortedData.getPair(sortedData.size() - 1).getInput(best.dimension);
// Somehow our chunk covers the entire domain
/*
* It's actually possible for this to happen, although it is
* extremely rare. If the training algorithm does not converge, but
* its very last adjustment to the network made it perfect, this
* case will occur.
*/
if (best.start == domainMin && best.end == domainMax) {
return SplitType.unnecessary;
}
// Are we on the lower or upper edge?
else if (best.start == domainMin || best.end == domainMax) {
// Make sure there are at least two distinct values in the chunk
if (countDistinctValuesInChunk(best.start, best.end, sortedData,
best.dimension) < 2) {
// Can't split (too few values) - try centroid
return centroidSplit(n, r, d);
} else {
// split based on min/max
List<DataRegion> splitRegions;
ArrayList<Double> splitPoints = new ArrayList<Double>();
ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
// lower half is solved
if(best.start == domainMin) {
splitRegions = r.split(best.dimension, best.end);
unsolvedRegions.add(splitRegions.get(1));
solvedRegions.add(splitRegions.get(0));
splitPoints.add(best.end);
//upper half is solved
} else {
splitRegions = r.split(best.dimension, best.start);
unsolvedRegions.add(splitRegions.get(0));
solvedRegions.add(splitRegions.get(1));
splitPoints.add(best.start);
}
// Remove existing network
trainingMNetwork.removeNetwork(n);
// Re-add solved region
queueSolvedNetwork(n, solvedRegions);
// create new network for the unsolved region
queueUnsolvedNetworks(n, unsolvedRegions);
// Create and dispatch split event
HandleSplit(SplitType.chunk,
unsolvedRegions, solvedRegions, splitPoints,
best.length, best.dimension);
return SplitType.chunk;
}
}
// We're in the middle of the data set
else {
// Need at least three distinct values in the chunk
if (countDistinctValuesInChunk(best.start, best.end, sortedData,
best.dimension) < 3) {
// Can't split - try centroid
return centroidSplit(n, r, d);
} else {
// split data into [unsolved] [solved] [unsolved]
ArrayList<Double> splitPoints = new ArrayList<Double>();
splitPoints.add(best.start);
splitPoints.add(best.end);
List<DataRegion> firstSplit = r.split(best.dimension, best.start);
List<DataRegion> secondSplit =
firstSplit.get(1).split(best.dimension, best.end);
ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
unsolvedRegions.add(firstSplit.get(0));  // low unsolved region
unsolvedRegions.add(secondSplit.get(1)); // high unsolved region
ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
solvedRegions.add(secondSplit.get(0));   // central solved region
// Remove existing network
trainingMNetwork.removeNetwork(n);
// Re-add solved region
queueSolvedNetwork(n, solvedRegions);
// create new networks for the unsolved upper / lower regions
queueUnsolvedNetworks(n, unsolvedRegions);
// Create and dispatch split event
HandleSplit(SplitType.chunk,
unsolvedRegions, solvedRegions, splitPoints,
best.length, best.dimension);
return SplitType.chunk;
}
}
// No chunk or chunk too small
} else {
sizeFailures++;
return centroidSplit(n, r, d);
}
}
/*
* Splits Dataset across 2 neural networks with no solved regions.
*/
private SplitType centroidSplit(NeuralNetwork n, DataRegion r, DataSet d) throws
NeuralNetworkException {
// To select the dimension to split along, find the one with the most
// distinct values
int bestDimension = 0;
int mostDistinctValues = 0;
for (int inputNum = 0; inputNum < n.getNumInputs(); inputNum++) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(inputNum);
int currDistinctValues = countDistinctValuesInChunk(
sortedData.getPair(0).getInput(inputNum),
sortedData.getPair(sortedData.size() - 1).getInput(inputNum),
sortedData,
inputNum);
if (currDistinctValues > mostDistinctValues) {
mostDistinctValues = currDistinctValues;
bestDimension = inputNum;
}
}
if (mostDistinctValues > 2) {
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(bestDimension);
int splitPoint = mostDistinctValues / 2;
// creating solved regions list (empty), unsolved regions, and split points list
ArrayList<Double> splitPoints = new ArrayList<Double>();
splitPoints.add(sortedData.getPair(splitPoint).getInput(bestDimension));
List<DataRegion> unsolvedRegions = r.split(bestDimension, sortedData
.getPair(splitPoint).getInput(bestDimension));
List<DataRegion> solvedRegions = new ArrayList<DataRegion>();
// removing original network - replaced by new split networks
trainingMNetwork.removeNetwork(n);
// create new networks for the unsolved upper / lower regions
queueUnsolvedNetworks(n, unsolvedRegions);
HandleSplit(SplitType.centroid,
unsolvedRegions, solvedRegions, splitPoints,
0, bestDimension);
return SplitType.centroid;
}
// No way to split
return SplitType.impossible;
}
/*
* Count all distinct values in the dimension that fall between min / max
*/
private int countDistinctValuesInChunk(double minValue, double maxValue,
DataSet sortedData, int chunkDimension) {
int count = 0;
double lastValue = Double.MIN_VALUE;
for (DataPair p : sortedData) {
double curValue = p.getInput(chunkDimension);
if (curValue > maxValue) {
break;
}
if (curValue >= minValue) {
if (curValue != lastValue) {
lastValue = curValue;
count++;
}
}
}
return count;
}
/*
* Finds largest solved chunk across all domains
*/
private Chunk findBestChunk(NeuralNetwork n, DataSet d) throws NeuralNetworkException
{
Chunk current;
Chunk best = new Chunk();
// For each input dimension, find largest solved chunk
for (int inputNum = 0; inputNum < d.getNumInputs(); inputNum++) {
// Create a dataset which is sorted on the input dimension
DataSet sortedData = (DataSet) d.clone();
sortedData.sortOnInput(inputNum);
current = new Chunk();
// reset chunk for new dimension
boolean failed;
double lastFail = Double.MIN_VALUE; // track last edge that failed
double lastGood = Double.MIN_VALUE; // track last edge that passed
for (int pairNum = 0; pairNum < sortedData.size(); pairNum++) {
DataPair currentPair = sortedData.getPair(pairNum);
// fast-forward through points with same dimensional value if original failed
// (don't want to split on a dimensional point with a mix of good/bad values)
if(pairNum > 0 && current.length == 0
&& currentPair.getInput(inputNum) == lastFail) {
continue;
}
List<Double> outputs = n.getOutputs(currentPair.getInputs());
// Check each output for correctness against expected
failed = false;
for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
// output is incorrect, reset chunk
if (Math.abs(outputs.get(outputNum) - currentPair.getOutput(outputNum)) > acceptableErrors[outputNum]) {
failed = true;
break;
}
}
if (failed) {
lastFail = currentPair.getInput(inputNum);
// only bother with the rest if we currently have a chunk of passing points
if(current.length > 0) {
// if point has same dimensional value as last good point, then we can't
// use this point - must revert back to the previous one
if(current.end == currentPair.getInput(inputNum)) {
current.length--;
current.end = lastGood;
}
// found a new best!
if(current.length > best.length) {
best = current;
}
current = new Chunk(); // reset chunk
}
}
// output is correct, increment chunk (if not already added for that
// dimensional value - only want one entry per dimensional value)
else if(currentPair.getInput(inputNum) != current.end) {
current.length++;
lastGood = current.end;
current.end = currentPair.getInput(inputNum);
if(current.length == 1) {
// this is a new chunk
current.start = current.end;
current.dimension = inputNum;
}
}
}
// extra check to make sure last chunk wasn't best
if(current.length > best.length) {
best = current;
}
}
// best should now hold best chunk values
return best;
}
}