SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS

Ryan Joseph Norton
B.S., University of California, Davis, 2004

PROJECT

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

COMPUTER SCIENCE

at

CALIFORNIA STATE UNIVERSITY, SACRAMENTO

FALL 2010

SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS

A Project

by

Ryan Joseph Norton

Approved by:

__________________________________, Committee Chair
V. Scott Gordon, Ph.D.

__________________________________, Second Reader
Behnam Arad, Ph.D.

____________________________
Date

Student: Ryan Joseph Norton

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the Project.

__________________________, Graduate Coordinator        ________________
Nikrouz Faroughi, Ph.D.                                  Date

Department of Computer Science

Abstract
of
SELF-SPLITTING NEURAL NETWORK VISUALIZATION TOOL ENHANCEMENTS
by
Ryan Joseph Norton

Self-splitting neural networks provide a new method for solving complex problems by using multiple neural networks in a divide-and-conquer approach to reduce the domain space each network must solve. However, choosing optimal points for splitting the domain is a difficult problem. A visualization tool exists to help understand how splitting occurs in the self-splitting neural network. This project added several enhancements to the tool to expand its scope and improve existing functionality. These enhancements included a new extensible framework for adding learning methods to the algorithm, integration of algorithmic improvements discovered since the original tool was released, and several new features for observing how the domain space is partitioned. These modifications can be used to develop further insights into the splitting and training processes.

_______________________, Committee Chair
V. Scott Gordon, Ph.D.

_______________________
Date

TABLE OF CONTENTS

List of Tables
List of Figures

Chapter
1. INTRODUCTION
2. BACKGROUND
   2.1 Neural Networks
   2.2 Neural Network Training Algorithms
   2.3 Self-Splitting Neural Networks
   2.4 Technologies
3. VISUALIZATION ENHANCEMENTS
   3.1 Training Options
   3.2 Rewind / Replay Functionality
   3.3 Domain Scaling
   3.4 Normalization / Grayscale
   3.5 Logging Functionality
4. SOFTWARE DESIGN
   4.1 Class Diagrams
5. PRELIMINARY RESULTS
6. CONCLUSIONS
7. FUTURE WORK
Appendix A
   1. Installation steps
   2. User Guide
   3. Steps to add a new scenario
   4. Steps to add a new training algorithm
   5. Steps to add a new splitting algorithm
   6. Steps to add a new fitness function
Appendix B
   Source Listing
   Selected Source Code
Bibliography

LIST OF TABLES

1. Comparison of PSO runs
2. Grayscale outputs of various training algorithms

LIST OF FIGURES

1. Velocity formula for PSO
2. Trained Region Algorithm
3. Area-Based Binary Splitting
4. Dialog window for setting custom parameters
5. Replay pane in the GUI
6. Screenshot of global and individual grayscale images
7. Modular Neural Network Classes
8. Splitting and Training Classes
9. User Interface Classes
10. Generalization Rates for Splitting and Training Algorithms
11. Network Size for Splitting and Training Algorithms
12. Generalization Rate for Fitness Functions on PSO
13. Sample scenario file
14. Sample training set
15. Sample XML Results file

Chapter 1
INTRODUCTION

Modular neural networks provide a divide-and-conquer approach to solving problems that a single neural network is unable to solve. Because of the "black box" nature of neural networks, it can be difficult to understand the underlying behavior and unintended side effects of using multiple networks together. An improved understanding of the various characteristics and generalization ability of these networks may lead to insights for improving the algorithms used. With this motivation, a visualization tool was developed in 2008 as a senior project by Michael Daniels, James Boheman, Marcus Watstein, Derek Goering, and Brandon Urban [Gordon3]. It provided a graphical view into several aspects of the splitting and training algorithms. In particular, the tool attempted to provide details on the order of the domain partitioning by the splitting algorithm and the area covered by each individual network. This project made several additions to the tool to increase its extensibility, add multiple splitting and training algorithms, provide more information on the individual networks, and integrate several improvements to enhance the tool's reporting functionality.

Chapter 2
BACKGROUND

2.1 Neural Networks

Neural networks arose through work in artificial intelligence on simulating neuron functions in the brain [Russell]. The basic structure of a neural network is a directed graph of neurons (vertices) connected via weights (edges). Values are entered on the input nodes, undergo a series of additions and multiplications as they pass through the network structure, and the final result is saved to the output nodes. Adjustments to the weight values change the calculations throughout the network and therefore the final values. For a standard feedforward network, training is done by running a set of training data consisting of inputs with known outputs through the network and attempting to determine a set of weights that causes the network results to sufficiently approximate the known outputs. This approach is called supervised learning. While neural networks do not provide exact outputs, they can provide close approximations. Once trained, they run quickly and often generalize well to provide correct outputs for non-training data. This is useful in situations where an exact algorithm cannot be determined, or runs too slowly to be useful.

2.2 Neural Network Training Algorithms

The primary complexity in neural networks is in the training algorithm that adjusts the weights. Backpropagation is a widely used method. It uses a two-step approach in which the output error is calculated for each item in the training set and fed backwards through the network using stochastic gradient descent [Rojas].

The process of adjusting weights to optimize outputs can be easily mapped to a variety of optimization and search algorithms, and several alternative training approaches have been researched. These include genetic algorithms [Lu], particle swarm optimization [Hu], and neighbor annealing [Gordon2]. All of these algorithms require feedback (also known as a fitness function) about the correctness of their current solution, so that they know when adjustments to the weights are more or less optimal. One basic approach is a minimum fitness function that sums the number of training data points that generated outputs outside acceptable error. Weights that generate fewer erroneous outputs are more "fit".
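As a concrete illustration, the following is a minimal sketch of such an unsolved-points fitness measure. The Network interface here is a stand-in used only for the example; the tool's own implementation is the SimpleUnsolvedPoints class listed in Appendix B, whose exact signatures may differ.

    // Hypothetical sketch: count training points whose outputs fall outside
    // the acceptable error. Lower counts indicate fitter weights.
    interface Network {
        double[] getOutputs(double[] inputs);
    }

    class UnsolvedPointsFitness {
        static int unsolvedCount(Network net, double[][] inputs,
                                 double[][] expected, double acceptableError) {
            int unsolved = 0;
            for (int i = 0; i < inputs.length; i++) {
                double[] actual = net.getOutputs(inputs[i]);
                for (int j = 0; j < actual.length; j++) {
                    if (Math.abs(actual[j] - expected[i][j]) > acceptableError) {
                        unsolved++; // this training point is not solved
                        break;      // count each point at most once
                    }
                }
            }
            return unsolved;
        }
    }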
2.2.1 Genetic Algorithms

Genetic algorithms were designed to simulate a simple "survival of the fittest" evolutionary model, in which individuals with characteristics that made them more fit were more likely to pass on portions of their solution to subsequent generations [Russell]. After calculating fitness for each individual, a selection methodology is used to pick individuals to move on to the next generation. This selection methodology is weighted towards individuals with better fitness; most implementations allow the same individual to be chosen more than once. In some variations, the most fit individual found in any generation is guaranteed a selection -- this is known as elitism. Once the next generation is chosen, individuals are paired up and portions of their solutions are swapped at randomly chosen crossover points. Finally, each element of the individual solution has a small chance to undergo a mutation to a new random value. As the genetic algorithm runs, individuals with better fitness show up more frequently, leading to more similar individuals that search a smaller portion of the solution space.
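For weight-vector individuals, the crossover and mutation steps can be sketched as follows. This is an illustrative sketch only; the mutation rate, weight range, and class name are assumptions, not taken from the tool's GeneticAlgorithm class.

    import java.util.Random;

    // Illustrative single-point crossover and per-element mutation on
    // weight vectors (assumed rates and ranges).
    class GaOperatorsSketch {
        static final Random rng = new Random();

        // Swap the tails of two parents at a randomly chosen crossover point.
        static void crossover(double[] a, double[] b) {
            int point = rng.nextInt(a.length);
            for (int i = point; i < a.length; i++) {
                double tmp = a[i];
                a[i] = b[i];
                b[i] = tmp;
            }
        }

        // Each weight has a small chance to be replaced by a new random value.
        static void mutate(double[] individual, double mutationRate) {
            for (int i = 0; i < individual.length; i++) {
                if (rng.nextDouble() < mutationRate) {
                    individual[i] = rng.nextDouble() * 2.0 - 1.0; // assumed range [-1, 1]
                }
            }
        }
    }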
2.2.2 Particle Swarm Optimization

Particle swarm optimization (PSO) attempts to simulate the swarm intelligence seen in the flocking behavior of animals such as birds and bees [Hu]. Particles within the swarm gravitate towards better solutions, searching these areas more thoroughly. Individual particles track their current position, velocity, and the best position (pbest) they have discovered. A global best position (gbest) is also tracked. At each iteration, the particle velocity is updated using the following formula [Hu]:

    p.velocity = c0 * p.velocity
               + c1 * R1 * (p.best - p.position)
               + c2 * R2 * (g.best - p.position)

    where c0, c1, and c2 are constants and R1 and R2 are randomly chosen from [0.0, 1.0].

Figure 1. Velocity formula for PSO.

The new velocity is then used to update the particle's position. As with genetic algorithms, in later iterations of the PSO algorithm individual solutions become clustered around the current best solution, looking for slight improvements.
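In code, the Figure 1 update is a per-dimension loop. The sketch below is a direct transcription of that formula; the constant values are common illustrative choices, not the tool's defaults.

    import java.util.Random;

    // One particle's velocity and position update, following Figure 1.
    class ParticleSketch {
        double[] position, velocity, best; // best = this particle's pbest

        void updateVelocity(double[] gbest, Random rng) {
            final double c0 = 0.8, c1 = 2.0, c2 = 2.0; // assumed constants
            for (int i = 0; i < velocity.length; i++) {
                double r1 = rng.nextDouble(); // R1 in [0.0, 1.0]
                double r2 = rng.nextDouble(); // R2 in [0.0, 1.0]
                velocity[i] = c0 * velocity[i]
                            + c1 * r1 * (best[i] - position[i])
                            + c2 * r2 * (gbest[i] - position[i]);
            }
        }

        void updatePosition() {
            for (int i = 0; i < position.length; i++) {
                position[i] += velocity[i]; // move along the new velocity
            }
        }
    }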
2.2.3 Neighbor Annealing

Neighbor annealing is a variation of simulated annealing [Gordon2]. A random position in the search space is chosen. At each iteration of the algorithm, a neighboring position is randomly chosen. If the neighboring position contains a better solution than the current one, it becomes the new current position. An annealing schedule is used to adjust how far away the neighbor is allowed to be. Initially the neighborhood size covers the search space, allowing the algorithm to jump anywhere. At each iteration the neighborhood size is decreased, eventually reducing the search to a form of hill-climbing.
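A minimal sketch of this loop over a weight vector is shown below, assuming a caller-supplied fitness callback (lower is better) and an illustrative geometric shrink factor for the annealing schedule; neither detail is taken from the tool's NeighborAnnealing class.

    import java.util.Random;

    // Sketch of neighbor annealing over a weight vector.
    class NeighborAnnealingSketch {
        interface Fitness { double of(double[] weights); } // lower is better

        static double[] anneal(double[] start, Fitness f, int iterations, Random rng) {
            double[] current = start.clone();
            double radius = 1.0; // initially spans the (normalized) search space
            for (int it = 0; it < iterations; it++) {
                double[] neighbor = current.clone();
                for (int i = 0; i < neighbor.length; i++) {
                    neighbor[i] += (rng.nextDouble() * 2.0 - 1.0) * radius;
                }
                if (f.of(neighbor) < f.of(current)) {
                    current = neighbor; // keep the better neighbor
                }
                radius *= 0.9999; // annealing schedule: shrink the neighborhood
            }
            return current; // late iterations amount to hill-climbing
        }
    }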
2.3 Self-Splitting Neural Networks

The training phase for a neural network is not guaranteed to find an acceptable set of weights to solve the training data. Sometimes the network is unable to effectively generalize due to the complexity of the training data, insufficient training time, or limitations in the network structure (i.e., the network lacks sufficient nodes to come up with a realistic model). Modular neural networks address these issues by partitioning the input domain between several neural networks. Self-splitting neural networks automate this division process. The ideal splitting algorithm should provide the best generalization possible while limiting the number of networks created [Gordon4]. Several splitting approaches have been proposed. The following are implemented in the visualization tool:

1. Centroid splitting finds the domain dimension with the most distinct values and splits it into roughly equivalent pieces. This approach attempts to halve the training set equally, without consideration for any partially solved regions of the set. This can be problematic for networks that are close to learning the entire training set, as splitting the set in half may break up the points that led to the solution.

2. Trained region splitting attempts to split the set based on network performance, by ensuring the largest subset of contiguously solved points in a single dimension, called a chunk, is not split apart. The split occurs based on where the chunk falls in the training set, using the algorithm in Figure 2 [Gordon1]. The smaller unsolved regions should be easier for the new networks to solve, as they contain fewer points and therefore less complexity.

    for each dimension d
        sort training set on d
        scan each point in the sorted set to determine range of
            largest contiguously solved points (chunk)
    if chunk is too small
        do centroid split
    else if chunk falls on an edge of the training set
            (i.e. [chunk] [unsolved region])
        create solved network for chunk
        create unsolved network for unsolved region with randomized weights
    else if chunk falls in the middle of the training set
            (i.e. [unsolved region 1] [chunk] [unsolved region 2])
        create solved network for chunk
        create unsolved network for unsolved region 1 with randomized weights
        create unsolved network for unsolved region 2 with randomized weights

Figure 2. Trained Region Algorithm

Each new unsolved network in trained region splitting starts with randomized weights. For cases where the network was close to solving the training set, the new network may waste many cycles just getting back to the previous network's performance.

3. Area-based binary splitting tries to solve this problem by adjusting the trained region algorithm. For cases where the chunk falls in the middle of the training set, the algorithm in Figure 3 is used [Gordon4]:

    else if chunk falls in the middle of the training set
            (i.e. [unsolved region 1] [chunk] [unsolved region 2])
        create unsolved network for ([smaller unsolved region] + [chunk]),
            starting with weights from parent network
        create unsolved network for [larger unsolved region]
            with randomized weights

Figure 3. Area-Based Binary Splitting

For a network that has nearly solved the training set, the combination of fewer unsolved points and more cycles to fine-tune the weights should improve the network's ability to solve the set. In the worst case, area-based binary splitting may take extra cycles and perform the same split as trained region splitting.

2.4 Technologies

The visualization tool was initially developed in Java using the Swing toolkit. This approach was kept for this project, as significant development effort would be required to transition to a new language or toolkit without an obvious benefit. Java allowed the tool to be developed without regard for target platforms - platform independence is handled by the Java virtual machine. Swing is a widget toolkit included as part of the Java Foundation Classes to provide a graphical user interface API. It provides a large set of cross-platform GUI components with a consistent look and feel. Swing components also proved to be highly customizable, allowing for fast and straightforward development of the GUI [Fowler].

The program was grouped into several Java packages that cover the various splitting and training algorithms, the graphical user interface (GUI), and the underlying modular neural network structure. These packages went through a large amount of refactoring over the course of the project; the final structures can be seen in more detail in Section 4.1 (Class Diagrams). Communication between the GUI, splitting, and training packages was handled by interfaces and event handlers, but did not account for the extra parameters required by different algorithms.

Chapter 3
VISUALIZATION ENHANCEMENTS

3.1 Training Options

To improve the usefulness of the visualization tool, a large section of the underlying framework was rewritten to allow for adding new algorithms. These include splitting algorithms, neural network training algorithms, and fitness functions. An interface was expanded or developed for each, along with corresponding hooks into the GUI to enable the end user to choose the appropriate algorithm and load or save pre-built scenario files. For more detail on the changes, see Section 4 (Software Design). Refer to Appendix A for steps to add new algorithms to the program.

Figure 4. Dialog window for setting custom parameters.

3.2 Rewind / Replay Functionality

As the training set is partitioned into smaller sets, the amount of time required to solve each subset tends to decrease. This makes observing the order of splitting difficult towards the end of the training run. A slider was added to the GUI to allow the user to rewind or fast-forward through the individual networks of the last trained self-splitting neural network (SSNN) and observe various characteristics of the currently selected network.

Figure 5. Replay pane in the GUI.

3.3 Domain Scaling

In the initial implementation of the visualization tool, inputs to the individual networks were not scaled separately. This can make solving networks with very close training subsets difficult, as the network must make very small changes in weights. The tool was modified to scale each training subset input to [0,1] to solve this problem. For special cases where one dimension contains all the same value (causing infinite scaling and divide-by-zero issues), the scaling factors from the split network are assigned to the new networks.
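A sketch of the per-subset scaling described above follows. The class and method names are illustrative, not the tool's ScalingParameters code; the degenerate-dimension branch mirrors the parent-scale fallback described in the text.

    // Fit per-dimension [0, 1] scaling for a training subset:
    // scaled = (x - offset) * factor for each dimension.
    class DomainScalerSketch {
        static double[][] fit(double[][] points, double[][] parentScale) {
            int dims = points[0].length;
            double[][] scale = new double[dims][];
            for (int d = 0; d < dims; d++) {
                double min = Double.POSITIVE_INFINITY;
                double max = Double.NEGATIVE_INFINITY;
                for (double[] p : points) {
                    min = Math.min(min, p[d]);
                    max = Math.max(max, p[d]);
                }
                if (max == min) {
                    // degenerate dimension: avoid dividing by zero by reusing
                    // the scaling factors from the split (parent) network
                    scale[d] = parentScale[d];
                } else {
                    scale[d] = new double[] { min, 1.0 / (max - min) };
                }
            }
            return scale;
        }
    }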
3.4 Normalization / Grayscale

While each individual network has a specified subset of the input domain, networks that generalize well may actually be able to solve a larger portion of the training set. Insights in this area may lead to better splitting algorithms that can take advantage of how much each individual network is really capable of solving. A previous update to the visualization tool had provided grayscale results for the SSNN [Gordon4]. Each pixel in the viewing window was fed into the SSNN and the output scaled to a grayscale integer value. The resulting image provided an easy-to-understand visual mapping of the network's outputs across the domain. This grayscale enhancement was merged with the new normalization code, allowing the user to also see how each individual network in the SSNN would solve the entire domain.

Figure 6. Screenshot of global and individual grayscale images.
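The pixel-to-grayscale mapping can be sketched as follows. This assumes a two-input, one-output network with outputs in [0, 1] and pixel coordinates already mapped into the input domain; the Net interface is a stand-in, not the tool's class.

    import java.awt.image.BufferedImage;

    // Render a network's output over the viewing window as grayscale.
    class GrayscaleRenderSketch {
        interface Net { double output(double x, double y); } // stand-in

        static BufferedImage render(Net net, int width, int height) {
            BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            for (int py = 0; py < height; py++) {
                for (int px = 0; px < width; px++) {
                    // map the pixel into the [0, 1] input domain
                    double out = net.output((double) px / width, (double) py / height);
                    int gray = (int) Math.round(Math.max(0.0, Math.min(1.0, out)) * 255);
                    img.setRGB(px, py, (gray << 16) | (gray << 8) | gray);
                }
            }
            return img;
        }
    }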
3.5 Logging Functionality

While the primary usage of the visualization tool involves direct manipulation by the user, it is also useful to gather details on each run for later analysis. A logging system was built to dump information on run settings and testing results to a comma-separated values (CSV) file. The log provides enough data to recreate the same run later if the user finds something they want to revisit. It also allows for easy aggregation and plotting of the data.
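A minimal sketch of appending one run to such a CSV log is shown below; the column set is an assumption for illustration and does not match the tool's actual log fields.

    import java.io.FileWriter;
    import java.io.IOException;

    // Append one run's settings and results as a CSV row.
    class RunLogSketch {
        static void append(String path, long seed, String trainer, String splitter,
                           double generalization, int networkCount) throws IOException {
            FileWriter out = new FileWriter(path, true); // true = append mode
            out.write(seed + "," + trainer + "," + splitter + ","
                    + generalization + "," + networkCount + "\n");
            out.close();
        }
    }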
Chapter 4
SOFTWARE DESIGN

Several interfaces and abstract classes were developed to improve the code's modularity. This streamlines the process of adding new training, splitting, and fitness algorithms and reduces the amount of coding required for future enhancements. Training algorithms are all located in the edu.csus.ecs.ssnn.nn.trainer package. At a minimum, all training algorithms are required to implement the TrainingAlgorithmInterface, which defines the functions required for training and reporting results. All the current training algorithms also extend the TrainingAlgorithm base class, which provides several common helper functions. For providing GUI options around each training algorithm, edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java provides the front-end dialog box, while edu.csus.ecs.ssnn.data.TrainingParameters.java defines the backend code used to create new training algorithm objects. A generic Params object is used in the TrainingParameters.java file to reduce the changes required to add new parameters - only the dialog and training algorithm files need to be adjusted. To make the same options work with saved and loaded scenario files, edu.csus.ecs.ssnn.fileio.SSNNDataFile.java contains code to push/pull the options to/from XML.
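The resulting pattern for a new algorithm can be sketched as below. The sketch is self-contained: the interface shown is a reduced stand-in for TrainingAlgorithmInterface (the real interface defines the training and reporting methods), and the parameter names are placeholders.

    // Reduced stand-in for the tool's TrainingAlgorithmInterface.
    interface TrainingAlgorithmSketch {
        boolean train();
    }

    // Shape of a new algorithm: a public Params inner class for the GUI and
    // scenario files, plus a (params, maxRuns, criteria) constructor.
    class MyAlgorithm implements TrainingAlgorithmSketch {
        public static class Params {
            public double myRate = 0.5; // placeholder algorithm parameter
        }

        private final Params params;
        private final int maxRuns;
        private final double[] acceptableErrors;

        public MyAlgorithm(Params params, int maxRuns, double[] acceptableErrors) {
            this.params = params;
            this.maxRuns = maxRuns;
            this.acceptableErrors = acceptableErrors;
        }

        public boolean train() {
            // the weight-adjustment loop would run here, bounded by maxRuns
            // and checked against acceptableErrors
            return false;
        }
    }

Appendix A walks through the remaining registration steps (the AlgorithmType_Map entry, the Params wiring, and the dialog card).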
4.1 Class Diagrams

Three subsections of the class diagram are provided in the following figures. These sections constitute the core functionality of the program - covering the main user interface, the modular neural network, and the splitting and training classes.

Figure 7. Modular Neural Network Classes.

Figure 8. Splitting and Training Classes.

Figure 9. User Interface Classes.

Chapter 5
PRELIMINARY RESULTS

After the additional algorithms were implemented in the tool, several sample runs were done to demonstrate how the tool can be used to gather data on the various algorithms. Initial tests were run on the two-spiral problem, as it is a difficult problem for neural networks to solve [Gordon3] and provides simple, easy-to-generate data sets. Runs were done with particle swarm optimization, area-based splitting, and largest-chunk fitness, with a hard-coded seed and all other settings left at their default values. Run details and visualization images can be seen in Table 1.

Run 1 solved quickly, but all networks were simple and most encompassed a single portion of one spiral. Individual networks showed simplistic grayscale images. Adding a second 4-node hidden layer for run 2 saw individual networks begin to create more complex patterns.

A third 4-node hidden layer was added for run 3. It did not improve the final SSNN. Total networks increased while the generalization rate went down. There were several cases of a network generating unnecessary complexity for the region it was covering. In this case, it appears the network's extra complexity worked against it: the final individually solved networks did not cover appreciably larger areas, and their extra complexity reduced the ability of the training algorithm to generalize correctly, as seen in the grayscale images of the individual networks.

For run 4, iterations were increased from 300,000 to 1,000,000. This had a positive effect on the three-layer network, as the number of networks produced went down and several networks covered larger areas. However, the grayscale images did not show a noticeable increase in complexity.

Run 5 attempted the same number of iterations on the two-layer network, which decreased its performance. This could be due to the simpler neural network being unable to generate a sufficiently complex model to cover larger regions. Because these tests used area-based splitting, which causes some new networks to start off with current weight values, they may have started with an overly complex model for the new, smaller region, leading to a lower generalization rate. Further research is required to determine this.

For run 6, iterations were increased to 2,000,000. This did not reduce the number of networks produced, but the extra iterations clearly took advantage of the network's extra complexity to model more accurate curves along larger regions. Run 7 increased iterations to 3,000,000. The network count was reduced, but generalization went down.

The networks were increased to four layers, each with four nodes, for run 8. This did not improve generalization, but the networks appear to start covering larger, more complex regions. Since the total number of networks did not significantly decrease, there may be a shift towards both larger and smaller region networks, reducing the uniformity in network size found in the networks with fewer layers. Further analysis of the training log could yield more information on the variation of region sizes between the different networks.

Run  Hidden Layers  Max Iterations  Generalization  Networks  Duration    Total Iterations  Chunk Splits  Centroid Splits  Size Failures  Notes
1    One 4-node     300,000         0.98769         59        19.0 sec.   17561305          56            2                2
2    Two 4-node     300,000         0.99148         57        31.0 sec.   17509249          55            1                1
3    Three 4-node   300,000         0.98580         62        51.0 sec.   19682785          61            0                0
4    Three 4-node   1,000,000       0.98106         51        133.0 sec.  52336342          50            0                0
5    Two 4-node     1,000,000       0.98580         62        114.0 sec.  61591067          59            2                2              No images collected
6    Three 4-node   2,000,000       0.99053         59        335.0 sec.  119159666         58            0                0
7    Three 4-node   3,000,000       0.98769         52        399.0 sec.  159121244         51            0                0              No images collected
8    Four 4-node    3,000,000       0.98300         54        523.0 sec.  160727717         50            3                3

Table 1. Comparison of PSO runs.

The logging system was used to track several runs of each algorithm for comparison purposes. The sample size was small - ten runs of each algorithm - but the intent was to demonstrate how run statistics can be quickly gathered and synthesized into useful information. The run log was used to collect run data for each combination of the available splitting and training algorithms (using default splitting and training settings).

Figure 10. Generalization Rates for Splitting and Training Algorithms.

Figure 11. Network Size for Splitting and Training Algorithms.

A second set of runs was done on PSO comparing fitness function performance. Again, all default settings were used and ten runs were done on each setting.

Figure 12. Generalization Rate for Fitness Functions on PSO.
Chapter 6
CONCLUSIONS

The project required a large amount of code refactoring to improve the program's extensibility and add all the features mentioned above. Much of the first half of the project was spent defining a logical class hierarchy and moving logic to the appropriate location. Over the course of this project, several changes were made as features were tested and found to have varying usefulness.

Domain scaling in particular went through several iterations. Testing with the initial implementation found that networks solved poorly or had errors in situations where a particular network had a domain dimension consisting of one point (e.g., {(1.5, -10.0), (1.5, 10.0)}). Attempting to scale to [1.5, 1.5] caused a loss of information in that dimension, and any attempt to scale back out caused a divide-by-zero error. The solution was to assign the same scaling used by the parent network when this situation arose in the inputs.

Initially, outputs were also scaled. This proved to be less useful for some training sets, such as the two-spiral problem, because training outputs were either 0 or 1. Training subsets' output domains were {0}, {1}, or {0,1}, and scaling a subset of all zeroes (or all ones) to [0,1] caused the error calculation to become meaningless. In general, a set of training data with identical outputs leads to a useless network, as it usually generalizes to either all inputs passing or all inputs failing. Future splitting algorithms may want to ensure that splits do not leave any training subsets with only one distinct output.

Initial testing of the domain scaling code showed a marked improvement in training. Without scaling, training would occasionally fail when network splits led to a grouping of three or four closely spaced points with differing outputs. While networks can still fail, with domain scaling the frequency is much lower.

The normalization code that tested how well an individual network solved all points in the training set also went through a few iterations. Initially it just displayed which points from the training set could be solved. This turned out to be of little use, as no real insight could be gained by looking at a scattering of points across the domain. However, applying this code to the grayscale image generation provides a much better sense of what the network solution looks like, and should prove to be a useful feature.

The replay functionality was very useful over the course of this project. The ability to look closely at an individual network made debugging the new splitting and training algorithms much easier - several subtle errors were caught by reviewing the output of individual networks. While this was not the primary purpose of the feature, it demonstrated how a different view of the data can provide new insights.

Chapter 7
FUTURE WORK

This project focused on expanding the capabilities and improving the extensibility of the visualization tool. These enhancements allow for a variety of new research; possibilities include:

- A splitting algorithm that ensures all split regions contain at least two distinct output values.
- Performance of a sum-squared error fitness function.
- Analyzing and comparing new neural network training algorithms.
- Analyzing and comparing new modular neural network splitting algorithms.
- Analyzing and comparing new fitness functions.

The visualization tool provides a framework for future research into splitting, training, and fitness algorithms and their effect on self-splitting neural networks. Over the course of this project, several new algorithms were added and tested. Several example images generated by the different training algorithms are shown in Table 2. However, at this point more data is needed to characterize any differences.

[Table 2 contains grayscale output images: columns are Particle Swarm Optimization, Genetic Algorithm, and Neighbor Annealing; rows are Run 1 and Run 2.]

Table 2. Grayscale outputs of various training algorithms.

There are also a few remaining enhancements that could prove useful. The tool can only display two input dimensions and one output dimension; on higher-dimension problems, the user should be able to choose which dimensions to draw. There may also be better ways to display higher-dimension problems. Also, fitness functions are not fully integrated with the logging and pre-defined scenario files - improving this would allow more detailed logging and reduce the steps necessary to set up a new training run.
APPENDIX A

1. Installation steps

Run all commands from the SSNN\bin directory.

Compile:
    javac -d "." -classpath "..\lib\jdom.jar;..\lib\swing-layout-1.0.3.jar" -sourcepath "..\src" "..\src\edu\csus\ecs\ssnn\ui\MainGUI.java"

Create JAR:
    jar cvf vis.jar edu

Run:
    java -classpath "..\lib\jdom.jar;..\lib\swing-layout-1.0.3.jar;vis.jar" edu.csus.ecs.ssnn.ui.MainGUI

2. User Guide

The program should be run with the included SSNN\bin directory as the working directory. This directory should contain the following files and folders:

    manual.xml - used to populate the embedded manual in the program.
    scenarios\ - holds scenario files, detailed in "Steps to add a new scenario".

A manual is provided with the program. It can be accessed from the menu via Help -> Manual. The manual provides information on the individual UI elements, as well as explanations of the various algorithms used. It is pulled from the manual.xml file located in the \bin folder.

3. Steps to add a new scenario

Scenarios are used to load predefined algorithm settings and data sets, reducing the number of steps needed to kick off training runs. The following example shows how to add a new scenario called "My scenario".

1. Create a new folder in SSNN\bin\scenarios\ called "My scenario".

2. Create a text file called scenario.txt with the following content:

    # Name to display
    Name= My scenario
    # Description to display
    Description= Sample scenario
    # Training data set
    TrainingData= myTrainingSet.dat
    # Testing data set
    TestingData= myTestingSet.dat
    # XML file that contains information on preset algorithms and parameters.
    ResultsFile= myscenario.xml
    # max/min vals for dimensions of training/testing sets.
    ScaleIn0= -10,10
    ScaleIn1= -10,10
    ScaleOut0= 0,1
    # Description of network, each layer is described:
    # hiddenTopology= num_nodes_in_layer
    # e.g. this is a 1 layer 4 node Neural Network:
    hiddenTopology= 4

Figure 13. Sample scenario file.

3. Provide the training and testing data files (in this case named myTrainingSet.dat and myTestingSet.dat). These files provide a simple list of training/testing points, given as a space-delimited list of inputs followed by outputs:

    9.97616 0.35630 1
    -9.97616 -0.35630 0
    9.93965 0.71090 1
    -9.93965 -0.71090 0
    9.89056 1.06335 1
    -9.89056 -1.06335 0
    9.82900 1.41320 1

Figure 14. Sample training set.

4. Provide the XML results file. It provides details on the various preset options chosen for this scenario. Example file content is below:

    <?xml version="1.0" encoding="UTF-8"?>
    <SSNNDataFile>
      <TrainingResults Solved="false" TrainingDuration="0.0" ChunksProduced="-1"
          ChunkSplits="0" HalfSplits="0" SizeFailures="0" TrainingIterations="0">
        <TrainingParameters SplitMode="CHUNK SPLIT" SplitAlgorithm="Trained Region"
            TrainingAlgorithm="Back Propagation" MaxRuns="300000" MinChunkSize="3"
            RandomSeed="0" SuccessCriteria="0.4" LearningRate="0.3" Momentum="0.9"
            fitness="Largest Chunk" />
      </TrainingResults>
      <TestingResults GeneralizationRate="-1.0">
        <TestingParameters CorrectnessCriteria="0.4" />
      </TestingResults>
      <VisualizationOptions ShowBackgroundImage="false" ShowAxes="true"
          ShowColors="true" ShowPoints="true" />
    </SSNNDataFile>

Figure 15. Sample XML Results file.

4. Steps to add a new training algorithm

The best approach is to use an existing training algorithm as a template while using the following steps to ensure a particular function is not skipped.

1. Add the new training algorithm in a .java file in the edu.csus.ecs.ssnn.nn.trainer package. It must implement TrainingAlgorithmInterface. It is recommended to extend the TrainingAlgorithm base class; the instructions assume this has been done.

2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java:
   a. Add a new entry to the AlgorithmType_Map hash map. This allows the algorithm to show up in the GUI drop-down for training algorithms. It is also used to identify the algorithm in any saved/loaded scenario files and log files.
   b. Add a new public variable for MyAlgorithm.Params.
   c. Update TrainingParameters() to initialize the new params object.
   d. Update the switch statement in getTrainingAlgorithmObject() to return a new instance of the training algorithm.

3. Update edu.csus.ecs.ssnn.fileio.SSNNDataFile.java:
   a. In the save() function, update the switch statement on training algorithms to set XML strings corresponding to each Params element.
   b. In the loadFrom() function, update the switch statement on training algorithms to set the Params object from the XML strings defined in save().

4. Update edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java:
   a. Add a new card (JPanel) to algorithmPanel.jPanel_AlgorithmOptions. This card will contain any settable parameters specific to the algorithm.
   b. Make sure to set the Card Name (under Properties) to the same string used in TrainingParameters.AlgorithmType_Map. This allows the auto-populated combobox to correctly display the card.
   c. Add a new help string for the algorithm.
   d. Wire up the MouseEnter properties for the card to display the help string.
   e. Update getTrainingParameters() to set the Params object from the custom UI elements defined on the card.
   f. Update presetControls() to set the custom UI elements defined on the card from the Params object.

5. Steps to add a new splitting algorithm

Adding a new splitting algorithm follows a similar approach to training algorithms, with a few steps removed. This is because the splitting algorithms do not take any extra parameters that must be accounted for. Note - this may change for future splitting algorithms, which would require refactoring some areas of code, most likely to add a parameter passing system similar to that used for the training algorithms.

1. Add the new splitting algorithm in a .java file in the edu.csus.ecs.ssnn.nn.splittrainer package. It must implement SSNNTrainingAlgorithmInterface. It is recommended to extend the SSNNTrainingAlgorithm base class; the instructions assume this has been done.

2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java:
   a. Add a new entry to the SplitAlgorithmType_Map hash map. This allows the split algorithm to show up in the GUI drop-down for splitting algorithms. It is also used to identify the algorithm in any saved/loaded scenario files and log files.
   b. Update the switch statement in getSplitTrainingAlgorithmObject() to return a new instance of the splitting algorithm (with appropriate options set).

6. Steps to add a new fitness function

Fitness functions follow a similar flow to the other algorithms. Because they are not used by all training functions (such as backpropagation), they are treated as one of the custom parameters that can be set by individual training algorithms. Currently, fitness functions are not saved or loaded by scenario files or reported in log files. This is primarily a workaround for the inability to pass function references in Java, and could be addressed with future enhancements. Adding a new fitness function requires the following steps:

1. Add the new fitness algorithm in a .java file in the edu.csus.ecs.ssnn.nn.trainer.fitness package. It must implement FitnessInterface.

2. Update edu.csus.ecs.ssnn.data.TrainingParameters.java by adding a new entry to the FitnessFunction_Map hash map. Note that this approach differs from the training algorithms in that an instance of the fitness function is created.

APPENDIX B

Full source code is too long to provide in this paper. Selected source files are available below. The complete source can be downloaded from the Mercurial repository at http://www.digitalxen.net/school/project/.

Source Listing

Underlined files have their source code included below.

edu.csus.ecs.ssnn.data.ScalingParameters.java
edu.csus.ecs.ssnn.data.ScenarioProperties.java
edu.csus.ecs.ssnn.data.TestingParameters.java
edu.csus.ecs.ssnn.data.TestingResults.java
edu.csus.ecs.ssnn.data.TrainingParameters.java
edu.csus.ecs.ssnn.data.TrainingResults.java
edu.csus.ecs.ssnn.data.VisualizationOptions.java
edu.csus.ecs.ssnn.data.XMLTreeModel.java
edu.csus.ecs.ssnn.event.NetworkConvergedEvent.java
edu.csus.ecs.ssnn.event.NetworkSplitEvent.java
edu.csus.ecs.ssnn.event.NetworkTrainerEventListenerInterface.java
edu.csus.ecs.ssnn.event.NetworkTrainingEvent.java
edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface.java
edu.csus.ecs.ssnn.event.TestingCompletedEvent.java
edu.csus.ecs.ssnn.event.TestingProgressEventListenerInterface.java
edu.csus.ecs.ssnn.event.TrainingCompletedEvent.java
edu.csus.ecs.ssnn.event.TrainingProgressEvent.java
edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface.java
edu.csus.ecs.ssnn.event.fileio.SSNNDataFile.java
edu.csus.ecs.ssnn.event.fileio.SSNNLogging.java
edu.csus.ecs.ssnn.nn.DataFormatException.java
edu.csus.ecs.ssnn.nn.DataPair.java
edu.csus.ecs.ssnn.nn.DataRegion.java
edu.csus.ecs.ssnn.nn.DataSet.java
edu.csus.ecs.ssnn.nn.IncompatibleDataException.java
edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.java
edu.csus.ecs.ssnn.nn.NNRandom.java
edu.csus.ecs.ssnn.nn.NetworkLayer.java
edu.csus.ecs.ssnn.nn.NetworkNode.java
edu.csus.ecs.ssnn.nn.NeuralNetwork.java
edu.csus.ecs.ssnn.nn.NeuralNetworkException.java
edu.csus.ecs.ssnn.nn.trainer.BackPropagation.java
edu.csus.ecs.ssnn.nn.trainer.GeneticAlgorithm.java
edu.csus.ecs.ssnn.nn.trainer.NeighborAnnealing.java
edu.csus.ecs.ssnn.nn.trainer.ParticleSwarmOptimization.java
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithm.java
edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface.java
edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface.java
edu.csus.ecs.ssnn.nn.trainer.fitness.LargestChunk.java
edu.csus.ecs.ssnn.nn.trainer.fitness.SimpleUnsolvedPoints.java
edu.csus.ecs.ssnn.splittrainer.AreaBasedBinarySplitTrainer.java
edu.csus.ecs.ssnn.splittrainer.NoSplitTrainer.java
edu.csus.ecs.ssnn.splittrainer.RandomSplitterTrainer.java
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithm.java
edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithmInterface.java
edu.csus.ecs.ssnn.splittrainer.TrainedResultsSplitTrainer.java
edu.csus.ecs.ssnn.ui.CreateSSNNDialog.java
edu.csus.ecs.ssnn.ui.HelpDialog.java
edu.csus.ecs.ssnn.ui.LoadDataSetDialog.java
edu.csus.ecs.ssnn.ui.LoadNetworkFromFileDialog.java
edu.csus.ecs.ssnn.ui.LoadPredefinedScenarioDialog.java
edu.csus.ecs.ssnn.ui.LoadSavedOptionsDialog.java
edu.csus.ecs.ssnn.ui.LoadTestingSetDialog.java
edu.csus.ecs.ssnn.ui.LoadTrainingSetDialog.java
edu.csus.ecs.ssnn.ui.Log.java
edu.csus.ecs.ssnn.ui.LogSettingsDialog.java
edu.csus.ecs.ssnn.ui.MainGUI.java
edu.csus.ecs.ssnn.ui.SSNNVisPanel.java
edu.csus.ecs.ssnn.ui.SaveToFileDialog.java
edu.csus.ecs.ssnn.ui.ScalingParameters.java
edu.csus.ecs.ssnn.ui.TestingParametersDialog.java
edu.csus.ecs.ssnn.ui.TestingResultsDialog.java
edu.csus.ecs.ssnn.ui.TrainingParametersDialog.java
edu.csus.ecs.ssnn.ui.TrainingResultsDialog.java
edu.csus.ecs.ssnn.ui.dialoghelpers.CSVFileFilter.java
edu.csus.ecs.ssnn.ui.dialoghelpers.FilePreview.java
edu.csus.ecs.ssnn.ui.dialoghelpers.XMLFileFilter.java

Selected Source Code
/* ===================================================================
   edu.csus.ecs.ssnn.data.TrainingParameters.java
   =================================================================== */
package edu.csus.ecs.ssnn.data;

import java.util.*;
import edu.csus.ecs.ssnn.nn.DataFormatException;
import edu.csus.ecs.ssnn.nn.trainer.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.*;
import edu.csus.ecs.ssnn.splittrainer.*;

public class TrainingParameters {
    public static enum SplitType { CENTROID_SPLIT, CHUNK_SPLIT };

    public static enum SplitAlgorithmType { NO_SPLIT, TRAINED_REGION, AREA_BASED_BINARY_SPLIT }

    // shortcut for connecting string to SplitAlgorithmType enum.
    // new algorithms need to be added to SplitAlgorithmType and this map
    public static final Map<String, SplitAlgorithmType> SplitAlgorithmType_Map =
        new HashMap<String, SplitAlgorithmType>() {{
            put("Area Based Binary Split", SplitAlgorithmType.AREA_BASED_BINARY_SPLIT);
            put("Trained Region", SplitAlgorithmType.TRAINED_REGION);
            put("No Split", SplitAlgorithmType.NO_SPLIT);
        }};

    public static enum AlgorithmType { BACK_PROPAGATION, GENETIC_ALGORITHM,
        PARTICLE_SWARM_OPTIMIZATION, NEIGHBOR_ANNEALING }

    // shortcut for connecting string to AlgorithmType enum.
    // new algorithms need to be added to AlgorithmType and this map
    public static final Map<String, AlgorithmType> AlgorithmType_Map =
        new HashMap<String, AlgorithmType>() {{
            put("Back Propagation", AlgorithmType.BACK_PROPAGATION);
            put("Genetic Algorithm", AlgorithmType.GENETIC_ALGORITHM);
            put("Particle Swarm Optimization", AlgorithmType.PARTICLE_SWARM_OPTIMIZATION);
            put("Neighbor Annealing", AlgorithmType.NEIGHBOR_ANNEALING);
        }};

    public BackPropagation.Params BackPropagation_Params;
    public GeneticAlgorithm.Params GeneticAlgorithm_Params;
    public NeighborAnnealing.Params NeighborAnnealing_Params;
    public ParticleSwarmOptimization.Params ParticleSwarmOptimization_Params;

    // shortcut for connecting string to a fitness function instance.
    // new fitness functions need to be added to this map
    public static final Map<String, FitnessInterface> FitnessFunction_Map =
        new HashMap<String, FitnessInterface>() {{
            put("Unsolved Points", new SimpleUnsolvedPoints());
            put("Largest Chunk", new LargestChunk());
        }};

    // splitting algorithm for generating subnetworks within the modular NN
    private SplitAlgorithmType splitAlgorithmType;

    // Type of splitting to use.
    private SplitType splitMode;

    // Algorithm to use to find weight values for neural network
    private AlgorithmType algorithmType;

    // The maximum number of runs allowed before training fails as unsolved.
    private int maxRuns;

    // The minimum neural network size allowed from any split.
    private int minChunkSize;

    // Seed used in the random number generator.
    private int randomSeed;

    // How close to the goal a value has to be to be considered correct.
    private double successCriteria;

    /**
     * Creates a new instance of TrainingParameters setting all
     * parameters to default values.
     */
    public TrainingParameters() {
        this(SplitType.CHUNK_SPLIT, SplitAlgorithmType.TRAINED_REGION,
             AlgorithmType.BACK_PROPAGATION, 0, 0, 0, 0.0);
    }

    public TrainingParameters(SplitType splitMode, SplitAlgorithmType splitAlgorithm,
            AlgorithmType algorithm, int maxRuns, int minChunkSize, int randomSeed,
            double successCriteria) {
        this.BackPropagation_Params = new BackPropagation.Params();
        this.GeneticAlgorithm_Params = new GeneticAlgorithm.Params();
        this.NeighborAnnealing_Params = new NeighborAnnealing.Params();
        this.ParticleSwarmOptimization_Params = new ParticleSwarmOptimization.Params();
        this.splitAlgorithmType = splitAlgorithm;
        this.splitMode = splitMode;
        this.algorithmType = algorithm;
        this.maxRuns = maxRuns;
        this.minChunkSize = minChunkSize;
        this.randomSeed = randomSeed;
        this.successCriteria = successCriteria;
    }

    public SplitAlgorithmType getSplitAlgorithmType() { return splitAlgorithmType; }
    public AlgorithmType getAlgorithmType() { return algorithmType; }
    public int getMaxRuns() { return maxRuns; }
    public int getMinChunkSize() { return minChunkSize; }
    public int getRandomSeed() { return randomSeed; }
    public SplitType getSplitMode() { return splitMode; }
    public double getSuccessCriteria() { return successCriteria; }

    public TrainingAlgorithmInterface getTrainingAlgorithmObject(int numOutputs)
            throws DataFormatException {
        double[] scaledCriteria = new double[numOutputs];
        scaledCriteria[0] = successCriteria;
        switch (getAlgorithmType()) {
            case BACK_PROPAGATION:
                return new BackPropagation(BackPropagation_Params, maxRuns, scaledCriteria);
            case GENETIC_ALGORITHM:
                return new GeneticAlgorithm(GeneticAlgorithm_Params, maxRuns, scaledCriteria);
            case PARTICLE_SWARM_OPTIMIZATION:
                return new ParticleSwarmOptimization(ParticleSwarmOptimization_Params,
                        maxRuns, scaledCriteria);
            case NEIGHBOR_ANNEALING:
                return new NeighborAnnealing(NeighborAnnealing_Params, maxRuns, scaledCriteria);
            default:
                return null;
        }
    }

    public SSNNTrainingAlgorithmInterface getSplitTrainingAlgorithmObject(int numOutputs)
            throws DataFormatException {
        double[] scaledCriteria = new double[numOutputs];
        scaledCriteria[0] = successCriteria;

        // if centroid split, set min chunk size to force only centroids
        int adjustedMinChunkSize;
        if (getSplitMode() == TrainingParameters.SplitType.CENTROID_SPLIT) {
            adjustedMinChunkSize = Integer.MAX_VALUE;
        } else {
            adjustedMinChunkSize = minChunkSize;
        }

        switch (getSplitAlgorithmType()) {
            case AREA_BASED_BINARY_SPLIT:
                AreaBasedBinarySplitTrainer area_trainer = new AreaBasedBinarySplitTrainer();
                area_trainer.setMinChunkSize(adjustedMinChunkSize);
                area_trainer.setAcceptableErrors(scaledCriteria);
                return area_trainer;
            case TRAINED_REGION:
                TrainedResultsSplitTrainer region_trainer = new TrainedResultsSplitTrainer();
                region_trainer.setMinChunkSize(adjustedMinChunkSize);
                region_trainer.setAcceptableErrors(scaledCriteria);
                return region_trainer;
            case NO_SPLIT:
            default:
                NoSplitTrainer none_trainer = new NoSplitTrainer();
                return none_trainer;
        }
    }

    public void setSplitAlgorithmType(SplitAlgorithmType algorithm) { this.splitAlgorithmType = algorithm; }
    public void setAlgorithmType(AlgorithmType algorithm) { this.algorithmType = algorithm; }
    public void setMaxRuns(int maxRuns) { this.maxRuns = maxRuns; }
    public void setMinChunkSize(int minChunkSize) { this.minChunkSize = minChunkSize; }
    public void setRandomSeed(int randomSeed) { this.randomSeed = randomSeed; }
    public void setSplitMode(SplitType splitMode) { this.splitMode = splitMode; }
    public void setSuccessCriteria(double successCriteria) { this.successCriteria = successCriteria; }
}
/* ===================================================================
   edu.csus.ecs.ssnn.event.NetworkSplitEvent.java
   =================================================================== */
package edu.csus.ecs.ssnn.event;

import java.util.EventObject;
import java.util.ArrayList;
import java.util.List;
import edu.csus.ecs.ssnn.nn.DataRegion;

/**
 * This event is fired off whenever networks are split by the splitting algorithm.
 */
public class NetworkSplitEvent extends EventObject {

    /**
     * List of possible split types
     */
    public enum SplitType { chunk, centroid }

    private double percentConverged = 0.0; // Ratio of converged networks to total number of networks
    private int chunkSize;                 // Size of the current network's chunk
    private int networksConverged;         // Number of networks that have converged
    private int networksInQueue;           // Number of networks waiting to be trained
    private int splitDimension;            // Dimension across which the split(s) happened
    private int totalNetworks;             // Total number of networks in the modular network

    private SplitType splitType = SplitType.chunk;

    // List of region(s) that were solved (if any)
    private List<DataRegion> solvedRegions = new ArrayList<DataRegion>();

    // List of regions that were not solved
    private List<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();

    // List of points at which the network was split
    private List<Double> splitPoints = new ArrayList<Double>();

    /**
     * Creates a new instance of NetworkSplitEvent.
     *
     * @param source Source that fired the event.
     * @param chunkSize The size of the current network's chunk.
     * @param networksInQueue Number of networks waiting to be trained.
     * @param networksConverged The number of networks that have converged.
     * @param splitDimension Dimension across which the split(s) happened.
     * @param splitType Type of split that occurred.
     * @param totalNetworks Total number of networks in the modular network.
     * @param percentConverged The ratio of converged networks to total number of networks.
     * @param unsolvedRegions List of regions that were not solved.
     * @param solvedRegions List of region(s) that were solved (if any).
     * @param splitPoints List of points at which the network was split.
     */
    public NetworkSplitEvent(Object source, int chunkSize, int networksInQueue,
            int networksConverged, int splitDimension, SplitType splitType,
            int totalNetworks, double percentConverged,
            List<DataRegion> unsolvedRegions, List<DataRegion> solvedRegions,
            List<Double> splitPoints) {
        super(source);
        this.chunkSize = chunkSize;
        this.networksInQueue = networksInQueue;
        this.networksConverged = networksConverged;
        this.splitDimension = splitDimension;
        this.splitType = splitType;
        this.totalNetworks = totalNetworks;
        this.percentConverged = percentConverged;
        this.unsolvedRegions = unsolvedRegions;
        this.solvedRegions = solvedRegions;
        this.splitPoints = splitPoints;
    }

    /** @return The size of the current network's chunk */
    public int getChunkSize() { return (chunkSize); }

    /** @return Number of networks waiting to be trained */
    public int getNetworksInQueue() { return networksInQueue; }

    /** @return The number of networks that have converged. */
    public int getNetworksConverged() { return (networksConverged); }

    /** @return The ratio of converged networks to total number of networks. */
    public double getPercentConverged() { return percentConverged; }
    /** @return List of region(s) that were solved (if any) */
    public List<DataRegion> getSolvedRegions() { return solvedRegions; }

    /** @return Dimension across which the split(s) happened */
    public int getSplitDimension() { return splitDimension; }

    /** @return List of points at which the network was split */
    public List<Double> getSplitPoints() { return splitPoints; }

    /** @return Type of split that occurred */
    public SplitType getSplitType() { return splitType; }

    /** @return Total number of networks in the modular network */
    public int getTotalNetworks() { return totalNetworks; }

    /** @return List of regions that were not solved */
    public List<DataRegion> getUnsolvedRegions() { return unsolvedRegions; }
}

/* ===================================================================
   edu.csus.ecs.ssnn.event.NetworkTrainerEventListenerInterface.java
   =================================================================== */
package edu.csus.ecs.ssnn.event;

/**
 * Listener interface for training neural networks. It provides for starting
 * and completing network training.
 */
public interface NetworkTrainerEventListenerInterface {

    /**
     * Handler for the training completed event. This method is called when the
     * modular neural network is done with the training set, either by successfully
     * solving all points, or by reaching a failure condition.
     *
     * @param e event object.
     */
    public void trainingCompleted(TrainingCompletedEvent e);

    /**
     * Handler for a network training event. This method is called when a new neural
     * network begins to train.
     *
     * @param e event object.
     */
    public void networkTraining(NetworkTrainingEvent e);
}
/* ===================================================================
   edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Modular network that contains an internal list of networks and the regions
 * they solve.
 */
public class ModularNeuralNetwork implements Iterable<ModularNeuralNetwork.NetworkRecord> {
    private ArrayList<NetworkRecord> networks;
    private int numInputs;
    private int numOutputs;
    private Object trainingData;
    private DataRegion inputRange;
    private int lockedNetwork; // if set, will only use particular network for solving

    /**
     * Constructs an empty modular neural network with the given number of
     * inputs and outputs.
     *
     * @param inputs Inputs to the network.
     * @param outputs Outputs of the network.
     * @param inputRange Total region final modular network is expected to cover.
     */
    public ModularNeuralNetwork(int inputs, int outputs, DataRegion inputRange) {
        numInputs = inputs;
        numOutputs = outputs;
        this.inputRange = inputRange;
        networks = new ArrayList<ModularNeuralNetwork.NetworkRecord>();
        lockedNetwork = -1;
    }

    public int        getLockedNetwork() { return lockedNetwork; }
    public int        getNumInputs()     { return numInputs; }
    public int        getNumOutputs()    { return numOutputs; }
    public DataRegion getInputRange()    { return inputRange; }
    public Object     getTrainingData()  { return trainingData; }

    /** @return number of solved networks in the modular neural network. */
    public int getSolvedCount() {
        int solvedCount = 0;
        for (int i = 0; i < networks.size(); i++) {
            if (networks.get(i).isSolved()) {
                solvedCount++;
            }
        }
        return solvedCount;
    }

    public void setTrainingData(Object o) { trainingData = o; }

    /**
     * Tells the modular neural network to only use the passed network for
     * solving points - regardless of the data region.
     *
     * @param network_id Id of network to use.
     */
    public void setLockedNetwork(int network_id) { lockedNetwork = network_id; }

    /**
     * Adds a sub-network to the modular network which solves a given region.
     *
     * @param n the new sub-network
     * @param r the region the network is responsible for
     * @throws NeuralNetworkException when the number of network inputs doesn't match the region
     */
    public void addNetwork(NeuralNetwork n, DataRegion r) throws NeuralNetworkException {
        if (n.getNumInputs() != r.getDimensions() || n.getNumInputs() != numInputs
                || r.getDimensions() != numInputs) {
            throw new NeuralNetworkException("Network has mismatching inputs\n"
                    + "Network inputs: " + n.getNumInputs() + "\n"
                    + "Region dimensions: " + r.getDimensions() + "\n"
                    + "Modular Network inputs: " + numInputs);
        }
        networks.add(new NetworkRecord(n, r));
    }

    /**
     * Overloaded version for setting solved status (useful for already solved split networks).
     *
     * @throws NeuralNetworkException
     */
    public void addNetwork(NeuralNetwork n, DataRegion r, boolean solved)
            throws NeuralNetworkException {
        addNetwork(n, r);
        setSolvedStatus(networks.size() - 1, solved);
    }

    /**
     * Returns a list of the NetworkRecords with DataRegions that contain a point.
     *
     * @param pointCoords coordinates of the point to find
     * @return a list of NetworkRecords containing DataRegions which contain the given point
     */
    private List<NetworkRecord> findRecordsAtPoint(List<Double> pointCoords)
            throws IncompatibleDataException {
        ArrayList<NetworkRecord> found = new ArrayList<NetworkRecord>();
        for (int i = 0; i < networks.size(); i++) {
            if (networks.get(i).getDataRegion().containsPoint(pointCoords)) {
                found.add(networks.get(i));
            }
        }
        return found;
    }

    /**
     * Gets a list of DataRegions that contain a given input point.
     *
     * @param pointCoords an input point
     * @return a list of regions that contain the point
     * @throws IncompatibleDataException
     */
    public List<DataRegion> getDataRegionsAtPoint(List<Double> pointCoords)
            throws IncompatibleDataException {
        List<NetworkRecord> matchingRecords = findRecordsAtPoint(pointCoords);
        List<DataRegion> matchingRegions = new ArrayList<DataRegion>();
        for (NetworkRecord r : matchingRecords) {
            matchingRegions.add(r.getDataRegion());
        }
        return matchingRegions;
    }

    /**
     * @param id Index of network.
     * @return Data region associated with the specified network.
     */
    public DataRegion getDataRegion(int id) { return networks.get(id).getDataRegion(); }

    /**
     * @param id Index of network.
     * @return Neural network associated with the specified network id.
     */
    public NeuralNetwork getNetwork(int id) { return networks.get(id).getNeuralNetwork(); }

    /** @return Count of all networks in the modular neural network. */
    public int getNetworkCount() { return networks.size(); }
    /**
     * Gets a list of NeuralNetworks that are responsible for handling inputs at
     * the given input point.
     *
     * @param pointCoords an input point
     * @return a list of networks that are responsible for providing outputs for that point
     * @throws IncompatibleDataException
     */
    public List<NeuralNetwork> getNetworksAtPoint(List<Double> pointCoords)
            throws IncompatibleDataException {
        List<NetworkRecord> matchingRecords = findRecordsAtPoint(pointCoords);
        List<NeuralNetwork> matchingNetworks = new ArrayList<NeuralNetwork>();
        for (NetworkRecord r : matchingRecords) {
            matchingNetworks.add(r.getNeuralNetwork());
        }
        return matchingNetworks;
    }

    /**
     * @param region Region to search for.
     * @return Network associated with the specified data region.
     */
    public NeuralNetwork getNetworkForRegion(DataRegion region) {
        for (NetworkRecord r : networks) {
            if (r.getDataRegion() == region) {
                return r.getNeuralNetwork();
            }
        }
        return null;
    }

    /**
     * @param id Index of network.
     * @return Network record for associated id.
     */
    public NetworkRecord getNetworkRecord(int id) { return networks.get(id); }

    /**
     * Returns a list of outputs for a given set of inputs. An appropriate list
     * of networks is selected to feed the inputs into, and the outputs from
     * these networks are averaged to produce the final outputs.
     *
     * @param inputs the list of input values, which must match the
     *               ModularNeuralNetwork's number of inputs
     * @return the list of output values
     * @throws NeuralNetworkException
     */
    public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
        if (inputs.size() != numInputs) {
            throw new NeuralNetworkException("Incorrect number of inputs to ModularNeuralNetwork\n"
                    + "Input values: " + inputs.size() + ", network inputs: " + numInputs);
        }

        List<NeuralNetwork> matchingNetworks;
        // check for locked network. If locked, only use 1 network for solving
        if (lockedNetwork >= networks.size()) {
            throw new NeuralNetworkException("Locked network outside of networks range");
        } else if (lockedNetwork >= 0) {
            List<NeuralNetwork> n = new ArrayList<NeuralNetwork>();
            n.add(networks.get(lockedNetwork).getNeuralNetwork());
            matchingNetworks = n;
        } else {
            matchingNetworks = getNetworksAtPoint(inputs);
        }

        ArrayList<Double> finalOutputs = new ArrayList<Double>();
        double sums[] = new double[numOutputs];
        for (int i = 0; i < numOutputs; i++) {
            sums[i] = 0;
        }
        for (NeuralNetwork n : matchingNetworks) {
            List<Double> outputs = n.getOutputs(inputs);
            for (int i = 0; i < numOutputs; i++) {
                sums[i] += outputs.get(i);
            }
        }
        for (int i = 0; i < numOutputs; i++) {
            finalOutputs.add(sums[i] / matchingNetworks.size());
        }
        return finalOutputs;
    }

    /**
     * Gets the region that a given network can solve.
     *
     * @param network the network to find the region for.
     * @return the region solved by the network
     */
    public DataRegion getRegionForNetwork(NeuralNetwork network) {
        for (NetworkRecord r : networks) {
            if (r.getNeuralNetwork() == network) {
                return r.getDataRegion();
            }
        }
        return null;
    }

    /**
     * @param id Index of network.
     * @return Whether specified network has been solved.
     */
    public boolean getSolvedStatus(int id) { return networks.get(id).isSolved(); }

    /** @return Iterator to list of NetworkRecords that hold the network and data region. */
    public synchronized Iterator<NetworkRecord> iterator() { return networks.iterator(); }

    /**
     * Removes a sub-network from the modular network and the associated region.
     * If there is no exactly matching sub-network in the modular network,
     * nothing is removed.
     *
     * @param network the network to remove
     */
    public void removeNetwork(NeuralNetwork network) {
        NetworkRecord removeMe = null;
        for (NetworkRecord r : networks) {
            if (r.getNeuralNetwork() == network) {
                removeMe = r;
                break;
            }
        }
        if (removeMe != null) {
            networks.remove(removeMe);
        }
    }

    /**
     * Removes a region from the modular network and the associated sub-network.
     * If there is no exactly matching region in the modular network, nothing is
     * removed.
     *
     * @param region the region to remove
     */
    public void removeRegion(DataRegion region) {
        NetworkRecord removeMe = null;
        for (NetworkRecord r : networks) {
            if (r.getDataRegion() == region) {
                removeMe = r;
                break;
            }
        }
        if (removeMe != null) {
            networks.remove(removeMe);
        }
    }

    /**
     * @param id Index of network
     * @param solved Network's solved status
     */
    public void setSolvedStatus(int id, boolean solved) {
        networks.get(id).setSolved(solved);
    }

    /**
     * A NetworkRecord associates a NeuralNetwork with a DataRegion. The network
     * in the record is the one that can give correct outputs for inputs in the
     * associated region of the input domain.
     */
    public class NetworkRecord {
        private NeuralNetwork n;
        private DataRegion r;            // region of the input domain associated with neural network
        private DataSet solvedPoints;    // all points (including outside of domain) that are solved
        private DataSet unsolvedPoints;  // all points (including outside of domain) that can't be solved
        private int[] grayscalePoints;   // holds grayscale calculations for reuse
        private boolean solved;
        private int trainingSetSize;     // saves # of points in training set

        /** Creates a new empty NetworkRecord */
        public NetworkRecord() {
            n = null;
            r = null;
            solvedPoints = null;
            solved = false;
            trainingSetSize = -1;
        }

        /**
         * @param n Neural network to add to record.
         * @param r Data region to add to record.
         * @throws NeuralNetworkException
         */
        public NetworkRecord(NeuralNetwork n, DataRegion r) throws NeuralNetworkException {
            this();
            setRecordData(n, r);
        }

        public NeuralNetwork getNeuralNetwork()      { return n; }
        public DataRegion    getDataRegion()         { return r; }
        public DataSet       getSolvedDataPoints()   { return solvedPoints; }
        public DataSet       getUnsolvedDataPoints() { return unsolvedPoints; }
        public int           getTrainingSetSize()    { return trainingSetSize; }
        public int[]         getGrayscalePoints()    { return grayscalePoints; }
        public boolean       isSolved()              { return solved; }

        public void setSolved(boolean solved) { this.solved = solved; }
        public void setSolvedDataPoints(DataSet points) { this.solvedPoints = points; }
        public void setUnsolvedDataPoints(DataSet points) { this.unsolvedPoints = points; }
        public void setTrainingSetSize(int size) { this.trainingSetSize = size; }
        public void setGrayscalePoints(int[] points) { this.grayscalePoints = points; }

        /**
         * Associates network n with region r.
         *
         * @param n the neural network that handles inputs in region r
         * @param r the region of the input domain that the neural network can solve.
         * @throws NeuralNetworkException if the number of network inputs doesn't match the data region.
         */
        public void setRecordData(NeuralNetwork n, DataRegion r) throws NeuralNetworkException {
            if (n.getNumInputs() != r.getDimensions()) {
                throw new NeuralNetworkException("Network inputs do not match region dimensions\n"
                        + "Network inputs: " + n.getNumInputs() + "\n"
                        + "Region dimensions: " + r.getDimensions());
            }
            this.n = n;
            this.r = r;
        }
    }
}
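The following usage sketch is not part of the project source; it illustrates how sub-networks and regions combine in the modular network. The networks, regions, and query point are assumed to be constructed elsewhere.

// Illustrative only: adding sub-networks and querying the modular network.
// If the regions overlap at `point`, getOutputs() averages the sub-networks' outputs.
static void demoModularNetwork(ModularNeuralNetwork modNet, NeuralNetwork n1, NeuralNetwork n2,
        DataRegion r1, DataRegion r2, List<Double> point) throws NeuralNetworkException {
    modNet.addNetwork(n1, r1);       // each sub-network owns a region
    modNet.addNetwork(n2, r2, true); // overloaded form: mark as already solved
    List<Double> outs = modNet.getOutputs(point);
    System.out.println(outs + " from " + modNet.getNetworkCount()
            + " sub-networks, " + modNet.getSolvedCount() + " solved");
}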
/* ===================================================================
   edu.csus.ecs.ssnn.nn.NetworkLayer.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * A layer of the network, containing individual nodes.
 */
public class NetworkLayer implements Iterable<NetworkNode> {
    // List of nodes in the layer. This does not include a bias node.
    private ArrayList<NetworkNode> nodes;

    // Helper for training algorithms. This field is not used by the instance and
    // serves only as a place for a training algorithm to store data about the instance.
    private Object trainingData;

    /**
     * Constructs a new NetworkLayer containing the given number of cells, all
     * of which have previousLayerSize number of weights. All of the weights are
     * initialized to zero. To set initial random weights, call randomizeWeights.
     *
     * @param cells the number of cells in this layer
     * @param previousLayerSize the number of cells in the previous layer
     */
    public NetworkLayer(int cells, int previousLayerSize) {
        nodes = new ArrayList<NetworkNode>();
        for (int i = 0; i < cells; i++) {
            nodes.add(new NetworkNode(previousLayerSize));
        }
    }

    /**
     * Assigns small random values to each of the layer's nodes' input weights.
     *
     * @param weightMin Minimum allowed weight value for the network.
     * @param weightMax Maximum allowed weight value for the network.
     * @param percOfSpace Allowed initial weight distance from midpoint.
     * @see NetworkNode.randomizeWeights()
     * @see NeuralNetwork.randomizeWeights()
     */
    public void randomizeWeights(double weightMin, double weightMax, double percOfSpace) {
        for (NetworkNode n : nodes) {
            n.randomizeWeights(weightMin, weightMax, percOfSpace);
        }
    }

    public Object getTrainingData() { return trainingData; }
    public void setTrainingData(Object trainingData) { this.trainingData = trainingData; }

    /**
     * Gets a particular node inside the layer.
     *
     * @param node the number of the node to get
     * @return the requested network node
     */
    public NetworkNode getNode(int node) { return nodes.get(node); }

    /**
     * Gets an Iterator over the nodes in the layer.
     *
     * @see java.lang.Iterable#iterator()
     */
    public Iterator<NetworkNode> iterator() { return nodes.iterator(); }

    /** @return number of nodes in the layer. */
    public int getNodeCount() { return nodes.size(); }

    /**
     * Calculates list of output values for the layer given a set of inputs.
     *
     * @param inputs Inputs to the layer - must match number of weights to layer.
     * @return Calculated outputs from the layer.
     * @throws NeuralNetworkException
     */
    public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
        ArrayList<Double> layerOutputs = new ArrayList<Double>();
        for (NetworkNode n : nodes) {
            layerOutputs.add(n.getOutput(inputs));
        }
        return layerOutputs;
    }
    /** @return list of weights pulled from all nodes in the layer. */
    public List<Double> getWeights() {
        List<Double> weights = new ArrayList<Double>();
        for (NetworkNode node : nodes) {
            weights.addAll(node.getWeights());
        }
        return weights;
    }
}

/* ===================================================================
   edu.csus.ecs.ssnn.nn.NetworkNode.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

/**
 * An individual node in the neural network, which is primarily defined
 * by the list of weights connecting it to the bias node and nodes in the
 * previous layer.
 */
public class NetworkNode implements Iterable<Double> {
    // List of weights of nodes in the previous network layer.
    private ArrayList<Double> inWeights;

    // Weight of the bias node.
    private double biasWeight;

    // Helper for training algorithms. This field is not used by the instance and
    // serves only as a place for a training algorithm to store data about the instance.
    private Object trainingData;

    // The last computed output value from the NetworkNode.
    // Useful to training algorithms.
    private double lastOutput;

    /**
     * Constructs a NetworkNode with a supplied number of input weights, all
     * initialized to zero. Call randomizeWeights to assign small random values
     * to the weights.
     *
     * @param inputNodes the number of inputs there are to this node
     */
    public NetworkNode(int inputNodes) {
        inWeights = new ArrayList<Double>();
        for (int i = 0; i < inputNodes; i++) {
            inWeights.add(0.0);
        }
        biasWeight = 0;
        trainingData = null;
        lastOutput = 0;
    }

    /**
     * Assigns random values to each of the node's input weights. The values
     * are scaled to fall around the midpoint of [weightMin, weightMax], using
     * percOfSpace to determine how far away from the midpoint is allowed.
     *
     * @param weightMin Minimum weight value.
     * @param weightMax Maximum weight value.
     * @param percOfSpace How far away from midpoint of weights new random value is allowed. (0.1 = 10%)
     * @see NetworkLayer.randomizeWeights()
     * @see NeuralNetwork.randomizeWeights()
     */
    public void randomizeWeights(double weightMin, double weightMax, double percOfSpace) {
        Random r = NNRandom.getRandom();
        double weightDelta = (weightMax - weightMin) * percOfSpace;
        double scaledMin = (weightMax - weightMin) / 2.0 + weightMin - weightDelta / 2.0;
        for (int i = 0; i < inWeights.size(); i++) {
            double weight = r.nextDouble() * weightDelta + scaledMin;
            inWeights.set(i, weight);
        }
        biasWeight = r.nextDouble() * weightDelta + scaledMin;
    }

    // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
    public double getBiasWeight()   { return biasWeight; }
    public double getLastOutput()   { return lastOutput; }
    public Object getTrainingData() { return trainingData; }

    public void setBiasWeight(double newWeight) { biasWeight = newWeight; }
    public void setTrainingData(Object trainingData) { this.trainingData = trainingData; }
    // </editor-fold>

    /**
     * Set a particular weight value on the node.
     *
     * @param weightId Zero-based weight index.
     * @param newWeight Weight value to set.
     */
    public void setWeight(int weightId, double newWeight) { inWeights.set(weightId, newWeight); }

    /**
     * Gets a particular weight value from the node.
     *
     * @param weightId Zero-based weight index.
     * @return Weight value
     */
    public double getWeight(int weightId) { return inWeights.get(weightId); }
    /** @return List of all weights on the node, including bias weight (at the end). */
    public List<Double> getWeights() {
        // include bias node
        List<Double> weights = new ArrayList<Double>(inWeights);
        weights.add(biasWeight);
        return weights;
    }

    /**
     * Gets the number of weight values this node is tracking. The getOutput
     * methods need exactly as many inputs as the value returned by this method.
     *
     * @return count of all weight values for the node, including bias node.
     */
    public int getWeightCount() {
        return inWeights.size() + 1; // accounting for bias node
    }

    /** @return Iterator over the node's weights. */
    public Iterator<Double> iterator() { return inWeights.iterator(); }

    /**
     * Computes the output of the node based on the given inputs and the stored
     * weights.
     *
     * @param inputValues a list of input values to the node, one for each weight
     * @return the output value of the node
     * @throws NeuralNetworkException
     */
    public double getOutput(List<Double> inputValues) throws NeuralNetworkException {
        if (inputValues.size() != inWeights.size()) {
            throw new NeuralNetworkException("Number of inputs doesn't match node's weights.\n"
                    + "Input values: " + inputValues.size() + ", weights: " + inWeights.size());
        }
        return computeOutput(inputValues);
    }

    /**
     * Helper to the getOutput methods which does the computation and returns
     * the output value.
     *
     * @param inputs a list of doubles representing the inputs to the node
     * @return the computed output value
     */
    private double computeOutput(List<Double> inputs) {
        double sum = 0;
        int i = 0;
        for (double in : inputs) {
            sum += in * inWeights.get(i);
            i++;
        }
        sum += biasWeight; // Bias node which always outputs 1.
        double logisticOutput = 1.0 / (1.0 + Math.exp(-sum));
        lastOutput = logisticOutput;
        return logisticOutput;
    }
}
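Since every node applies the same weighted-sum-plus-sigmoid rule, a short worked example may help. The weight values below are invented purely for illustration; this is not part of the project source.

// Illustrative only: a node with input weights [0.5, -1.0] and bias weight 0.25,
// fed inputs [1.0, 2.0], computes
//   sum    = 1.0*0.5 + 2.0*(-1.0) + 0.25 = -1.25
//   output = 1 / (1 + e^1.25) ≈ 0.2227
static double demoNodeOutput() throws NeuralNetworkException {
    NetworkNode node = new NetworkNode(2);
    node.setWeight(0, 0.5);
    node.setWeight(1, -1.0);
    node.setBiasWeight(0.25);
    return node.getOutput(java.util.Arrays.asList(1.0, 2.0)); // ≈ 0.2227
}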
/* ===================================================================
   edu.csus.ecs.ssnn.nn.NeuralNetwork.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Defines the structure of the neural network, including all layers and nodes
 * and the weights between them. Also handles all necessary input scaling.
 */
public class NeuralNetwork implements Iterable<NetworkLayer> {
    // A list of the layers in the network. This includes the hidden layers and
    // output layer, but no input layer.
    private ArrayList<NetworkLayer> layers;

    // The number of inputs this neural network can accept.
    private int numInputs;

    // The number of weights in this neural network.
    // Calculated when network is built.
    private int numWeights;

    // Helper for training algorithms. This field is not used by the instance and
    // serves only as a place for a training algorithm to store data about the instance.
    private Object trainingData;

    // Max/min weight values can be customized. Available to training algorithms.
    private double weightMax;
    private double weightMin;

    // store mins/maxes for all points in set
    private DataRegion _scalingInputRegion;
    private ArrayList<Double> _inLocalMins;
    private ArrayList<Double> _inGlobalMins;
    private ArrayList<Double> _inSlopes;

    /**
     * Constructs a new neural network with the given topology, with all
     * connections between nodes having a default weight of 0. To initialize the
     * weights with random values, call randomizeWeights.
     *
     * @param inputs the number of inputs to the network
     * @param outputs the number of outputs from the network
     * @param HiddenTopology a list of node counts for each hidden layer. Can be
     *                       null or empty for no hidden layers.
     */
    public NeuralNetwork(int inputs, int outputs, List<Integer> HiddenTopology) {
        layers = new ArrayList<NetworkLayer>();
        numInputs = inputs;
        weightMax = 20.0;
        weightMin = -20.0;

        // An input layer isn't needed because we never have to actually
        // calculate anything using it. When the network does a computation,
        // the inputs will be supplied to the first hidden layer (or to the
        // output layer if there are no hidden layers).
        int lastLayerSize = inputs;

        // Hidden layers
        if (HiddenTopology != null) {
            for (Integer i : HiddenTopology) {
                layers.add(new NetworkLayer(i, lastLayerSize));
                numWeights += i * (lastLayerSize + 1); // weights between last layer and current (accounting for bias node)
                lastLayerSize = i;
            }
        }
        // Output layer
        layers.add(new NetworkLayer(outputs, lastLayerSize));
        numWeights += outputs * (lastLayerSize + 1);
    }

    // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
    public int        getNumInputs()     { return numInputs; }
    public int        getNumWeights()    { return numWeights; }
    public Object     getTrainingData()  { return trainingData; }
    public double     getWeightMax()     { return weightMax; }
    public double     getWeightMin()     { return weightMin; }
    public DataRegion getScalingRegion() { return _scalingInputRegion; }

    public void setTrainingData(Object trainingData) { this.trainingData = trainingData; }
    public void setWeightMax(double weightMax) { this.weightMax = weightMax; }
    public void setWeightMin(double weightMin) { this.weightMin = weightMin; }
    // </editor-fold>

    /**
     * Initializes all the weights in the network to random values between
     * the min/max allowed weight values.
     *
     * @see NetworkLayer.randomizeWeights()
     * @see NetworkNode.randomizeWeights()
     */
    public void randomizeWeights(double percOfSpace) {
        for (NetworkLayer l : layers) {
            l.randomizeWeights(weightMin, weightMax, percOfSpace);
        }
    }

    /**
     * Generates outputs from the neural network given a list of inputs.
     *
     * @param inputs the input values to the network
     * @return a list of outputs from the network
     * @throws NeuralNetworkException on incorrect number of inputs to the network.
     */
    public List<Double> getOutputs(List<Double> inputs) throws NeuralNetworkException {
        // Make sure there's a correct number of inputs before beginning.
        if (inputs.size() != numInputs) {
            throw new NeuralNetworkException("Incorrect number of inputs to network.");
        }
        // scale inputs
        List<Double> nextLayerInputs = scale(inputs, _inLocalMins, _inGlobalMins, _inSlopes);
        for (NetworkLayer l : layers) {
            nextLayerInputs = l.getOutputs(nextLayerInputs);
        }
        List<Double> outputs = nextLayerInputs;
        return outputs;
    }

    /**
     * Scales the point using the network specific scaling factors.
     *
     * @param values The values that make up the data point.
     * @param domainMins Minimum domain values. Usually the training set mins.
     * @param rangeMins Minimum range values. Usually the minimum of the data region
     *                  covered by the network.
     * @param slopes Previously calculated slopes of delta(range)/delta(domain)
     * @return Scaled point
     * @throws NeuralNetworkException on mismatched dimensions.
     */
    protected static List<Double> scale(List<Double> values, List<Double> domainMins,
            List<Double> rangeMins, List<Double> slopes) throws NeuralNetworkException {
        if ((values.size() != domainMins.size()) || (domainMins.size() != rangeMins.size())) {
            throw new NeuralNetworkException("Unable to scale points - dimension mismatch");
        }
        List<Double> scaled = new ArrayList<Double>();
        for (int i = 0; i < values.size(); i++) {
            scaled.add((values.get(i) - domainMins.get(i)) * slopes.get(i) + rangeMins.get(i));
        }
        return scaled;
    }

    /** @param layer Zero-based index to layer in the neural network. */
    public NetworkLayer getLayer(int layer) { return layers.get(layer); }

    /**
     * Gets the number of layers in the network, including the hidden and output
     * layers but NOT an input layer.
     *
     * @return the number of hidden and output layers in the network
     */
    public int getLayerCount() { return this.layers.size(); }

    public List<Double> getWeights() {
        List<Double> weights = new ArrayList<Double>();
        for (NetworkLayer layer : layers) {
            weights.addAll(layer.getWeights());
        }
        return weights;
    }

    /** @return an Iterator over the network layers. */
    public Iterator<NetworkLayer> iterator() { return layers.iterator(); }

    /** @return count of outputs of the Neural Network (i.e. nodes in last layer). */
    public int getNumOutputs() { return layers.get(layers.size() - 1).getNodeCount(); }

    /** @return node count of each hidden layer */
    public List<Integer> getHiddenTopology() {
        ArrayList<Integer> topology = new ArrayList<Integer>();
        // stopping before last layer to avoid including output layer
        for (int i = 0; i < layers.size() - 1; i++) {
            topology.add(layers.get(i).getNodeCount());
        }
        return topology;
    }

    /**
     * Sets input domain scaling for network.
     * Saves scaling factor: (Range Max - Range Min) / (Domain Max - Domain Min)
     * and min value for each input dimension.
     *
     * @param local DataRegion for sub network inputs
     * @param global DataRegion for entire modular neural network inputs
     * @throws IncompatibleDataException on dimension mismatch.
     */
    public void setInputScaling(DataRegion local, DataRegion global)
            throws IncompatibleDataException {
        _scalingInputRegion = local;
        _inLocalMins = new ArrayList<Double>();
        _inGlobalMins = new ArrayList<Double>();
        _inSlopes = new ArrayList<Double>();
        // want to scale from local back up to global
        setScaling(_inLocalMins, _inGlobalMins, _inSlopes, local, global);
    }
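    // A quick numeric illustration of the scaling formula above (values
    // invented for this example; not part of the project source):
    // scaling a sub-network's local domain [2, 4] back up to the global
    // range [0, 10] in one dimension.
    //   slope = (10 - 0) / (4 - 2) = 5.0
    //   x = 3.0  ->  (3.0 - 2.0) * 5.0 + 0.0 = 5.0
    // i.e. the midpoint of the local region maps to the midpoint of the
    // global range.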
    /**
     * Determines scaling factor (Range Max - Range Min) / (Domain Max - Domain Min)
     * for each dimension, and saves the dimension minimums required to do scaling.
     *
     * @param domainMins Array to save domain mins to.
     * @param rangeMins Array to save range mins to.
     * @param slopes Array to save slopes (scaling factors) to.
     * @param domain Domain to calculate scaling from.
     * @param range Range to calculate scaling from.
     * @throws IncompatibleDataException on dimension mismatch.
     */
    protected static void setScaling(ArrayList<Double> domainMins, ArrayList<Double> rangeMins,
            ArrayList<Double> slopes, DataRegion domain, DataRegion range)
            throws IncompatibleDataException {
        // check dimensions
        if (domain.getDimensions() != range.getDimensions()) {
            throw new IncompatibleDataException("Dimensions for network scaling don't match");
        }
        for (int i = 0; i < domain.getDimensions(); i++) {
            domainMins.add(i, domain.getMin(i));
            rangeMins.add(i, range.getMin(i));
            slopes.add(i, (range.getMax(i) - range.getMin(i))
                    / (domain.getMax(i) - domain.getMin(i)));
        }
    }

    /**
     * Loads array of weights into the neural network.
     *
     * @param weights Array of weights (must match network node weight count)
     */
    public void setWeights(Double[] weights) {
        // set weights on nodes in first and hidden layers
        int p_index = -1;
        int prevLayerCount = getNumInputs();
        for (int layerNum = 0; layerNum < getLayerCount(); layerNum++) {
            NetworkLayer currLayer = getLayer(layerNum);
            // For each node in the layer:
            for (int currNodeNum = 0; currNodeNum < currLayer.getNodeCount(); currNodeNum++) {
                NetworkNode currNode = currLayer.getNode(currNodeNum);
                currNode.setBiasWeight(weights[++p_index]);
                // For each weight from curr node to last layer nodes
                for (int lastNodeNum = 0; lastNodeNum < prevLayerCount; lastNodeNum++) {
                    currNode.setWeight(lastNodeNum, weights[++p_index]);
                }
            }
            prevLayerCount = currLayer.getNodeCount();
        }
    }
}
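As a construction example, the sketch below (not part of the project source) builds a small network and checks the weight count implied by the constructor's bookkeeping. Note that getOutputs() expects setInputScaling() to have been called first, since it scales inputs before the forward pass.

// Illustrative only: a 2-input, 1-output network with one hidden layer of 3 nodes.
NeuralNetwork net = new NeuralNetwork(2, 1, java.util.Arrays.asList(3));
// weight count: 3*(2+1) hidden + 1*(3+1) output = 13
assert net.getNumWeights() == 13;
net.randomizeWeights(0.1); // initial weights within 10% of the weight-space midpoint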
/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.BackPropagation.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;

/**
 * This class implements the backpropagation training algorithm for single
 * neural networks.
 */
public class BackPropagation extends TrainingAlgorithm {
    public static class Params {
        protected double learningRate; // Backpropagation learning rate
        protected double momentum;     // Backpropagation momentum. 0 disables.

        // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
        public double getLearningRate() { return this.learningRate; }
        public double getMomentum()     { return this.momentum; }

        public void setLearningRate(double learningRate) { this.learningRate = learningRate; }
        public void setMomentum(double momentum) { this.momentum = momentum; }
        // </editor-fold>
    }

    protected Params params;

    public BackPropagation(Params params, int maxIterations, double[] acceptableError) {
        super(maxIterations, acceptableError);
        this.params = params;
    }

    public boolean train(NeuralNetwork n, DataSet d) throws NeuralNetworkException {
        VerifyInputs(n, d);

        // Put training data object in each node:
        int inputCount = n.getNumInputs();
        for (NetworkLayer l : n) {
            for (NetworkNode nn : l) {
                nn.setTrainingData(new BackPropNodeData(inputCount));
            }
            inputCount = l.getNodeCount();
        }

        int correctInARow = 0;
        boolean converged = false;
        int dataIndex = 0;
        iterations = 0;

        // train until stop conditions met
        while (iterations < maxIterations && !converged) {
            // Reset errors on all network nodes:
            for (NetworkLayer l : n) {
                for (NetworkNode nn : l) {
                    ((BackPropNodeData) nn.getTrainingData()).error = 0;
                }
            }

            List<Double> inputs = d.getPair(dataIndex).getInputs();

            // Forward pass:
            List<Double> outs = n.getOutputs(inputs);

            // Compute output errors:
            double[] errors = new double[d.getNumOutputs()];
            for (int i = 0; i < d.getNumOutputs(); i++) {
                // Error = desired output - actual output
                errors[i] = (d.getPair(dataIndex).getOutput(i) - outs.get(i));
            }

            // See if all outputs are within acceptable error
            boolean correctOutput = true;
            for (int i = 0; i < errors.length; i++) {
                if (Math.abs(errors[i]) > acceptableError[i]) {
                    correctOutput = false;
                    break;
                } else {
                    if (DEBUG) {
                        System.out.println("Hit!");
                        System.out.flush();
                    }
                }
            }

            if (correctOutput) {
                correctInARow++;
                // must get correct output on all items of dataset
                if (correctInARow >= d.size()) {
                    converged = true;
                    continue;
                }
            } else {
                if (DEBUG) {
                    if (correctInARow != 0) {
                        System.out.println("Chain broke at " + correctInARow);
                    }
                }
                correctInARow = 0;
            }

            // Start of backward pass
            // Adjust output errors:
            NetworkLayer outputLayer = n.getLayer(n.getLayerCount() - 1);
            for (int i = 0; i < outputLayer.getNodeCount(); i++) {
                NetworkNode currentNode = outputLayer.getNode(i);
                BackPropNodeData currNodeData = (BackPropNodeData) currentNode.getTrainingData();
                currNodeData.error = errors[i] * currentNode.getLastOutput()
                        * (1 - currentNode.getLastOutput());
            }

            // Adjust hidden layer errors:
            // For each layer (going backwards from the last hidden layer):
            for (int layerNum = n.getLayerCount() - 2; layerNum >= 0; layerNum--) {
                NetworkLayer currLayer = n.getLayer(layerNum);
                NetworkLayer nextLayer = n.getLayer(layerNum + 1);
                // For each node in the current layer:
                for (int currLayerNodeNum = 0; currLayerNodeNum < currLayer.getNodeCount(); currLayerNodeNum++) {
                    NetworkNode currentNode = currLayer.getNode(currLayerNodeNum);
                    BackPropNodeData currNodeData = (BackPropNodeData) currentNode.getTrainingData();
                    // For each node in the next layer:
                    for (int nextLayerNodeNum = 0; nextLayerNodeNum < nextLayer.getNodeCount(); nextLayerNodeNum++) {
                        NetworkNode nextNode = nextLayer.getNode(nextLayerNodeNum);
                        BackPropNodeData nextNodeData = (BackPropNodeData) nextNode.getTrainingData();
                        currNodeData.error += nextNode.getWeight(currLayerNodeNum) * nextNodeData.error
                                * currentNode.getLastOutput() * (1 - currentNode.getLastOutput());
                    }
                }
            }

            // Start of forward weight adjustment
            // Adjust the first layer based on the inputs:
            NetworkLayer firstLayer = n.getLayer(0);
            for (int currNodeNum = 0; currNodeNum < firstLayer.getNodeCount(); currNodeNum++) {
                NetworkNode currNode = firstLayer.getNode(currNodeNum);
                BackPropNodeData currNodeData = (BackPropNodeData) currNode.getTrainingData();
                for (int inputNum = 0; inputNum < inputs.size(); inputNum++) {
                    // Update the weight from the input layer to this layer:
                    double weightDelta = params.learningRate * inputs.get(inputNum) * currNodeData.error
                            + currNodeData.lastWeightChanges.get(inputNum) * params.momentum;
                    currNode.setWeight(inputNum, currNode.getWeight(inputNum) + weightDelta);
                    // Update the "previous change" for momentum:
                    currNodeData.lastWeightChanges.set(inputNum, weightDelta);
                }
                // Update bias node weight too:
                double biasWeightDelta = params.learningRate * currNodeData.error
                        + currNodeData.lastWeightChanges.get(currNodeData.lastWeightChanges.size() - 1)
                        * params.momentum;
                currNode.setBiasWeight(currNode.getBiasWeight() + biasWeightDelta);
                currNodeData.lastWeightChanges.set(currNodeData.lastWeightChanges.size() - 1, biasWeightDelta);
            }

            // Adjust remaining layers based on previous layer's outputs:
            // For each layer:
            for (int layerNum = 1; layerNum < n.getLayerCount(); layerNum++) {
                NetworkLayer currLayer = n.getLayer(layerNum);
                NetworkLayer prevLayer = n.getLayer(layerNum - 1);
                // For each node in the layer:
                for (int currNodeNum = 0; currNodeNum < currLayer.getNodeCount(); currNodeNum++) {
                    NetworkNode currNode = currLayer.getNode(currNodeNum);
                    BackPropNodeData currNodeData = (BackPropNodeData) currNode.getTrainingData();
                    // For each node in the previous layer:
                    for (int lastNodeNum = 0; lastNodeNum < prevLayer.getNodeCount(); lastNodeNum++) {
                        NetworkNode prevNode = prevLayer.getNode(lastNodeNum);
                        // Update weight between currNode and prevNode:
                        double weightDelta = params.learningRate * prevNode.getLastOutput() * currNodeData.error
                                + currNodeData.lastWeightChanges.get(lastNodeNum) * params.momentum;
                        currNode.setWeight(lastNodeNum, currNode.getWeight(lastNodeNum) + weightDelta);
                        // Update the "previous change" for momentum:
                        currNodeData.lastWeightChanges.set(lastNodeNum, weightDelta);
                    }
                    // Update bias node weight too:
                    double biasWeightDelta = params.learningRate * currNodeData.error
                            + currNodeData.lastWeightChanges.get(currNodeData.lastWeightChanges.size() - 1)
                            * params.momentum;
                    currNode.setBiasWeight(currNode.getBiasWeight() + biasWeightDelta);
                    currNodeData.lastWeightChanges.set(currNodeData.lastWeightChanges.size() - 1, biasWeightDelta);
                }
            }

            ReportProgress();
            dataIndex = (dataIndex + 1) % d.size();
            iterations++;
        }

        cleanNetwork(n); // Clean up training data
        return converged;
    }

    /**
     * BackPropNodeData objects are stored inside network nodes and are used to
     * carry information useful to the backpropagation training algorithm.
     */
    private class BackPropNodeData {
        private double error = 0; // Computed output error of the node.
        // List of previous changes to each weight. Used for momentum.
        private ArrayList<Double> lastWeightChanges;

        private BackPropNodeData(int inputs) {
            lastWeightChanges = new ArrayList<Double>();
            for (int i = 0; i < inputs; i++) {
                lastWeightChanges.add(0d);
            }
            lastWeightChanges.add(0d); // Last one is for bias node weight
        }
    }
}
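For reference, the update rule the code above implements can be summarized as standard backpropagation with momentum (this summary is an editorial aid, not part of the project source):

// delta = node error, out = logistic activation, w_hk = weight from node h to next-layer node k:
//   output node:  delta_o = (target - out) * out * (1 - out)
//   hidden node:  delta_h = out * (1 - out) * sum_k( w_hk * delta_k )
//   weight step:  dw_ij(t) = learningRate * out_i * delta_j + momentum * dw_ij(t-1)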
/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.GeneticAlgorithm.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import java.util.Arrays;
import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;

/**
 * Genetic Algorithm to replace backpropagation for training NN.
 *
 * An Individual consists of one set of data points that contain all the weights
 * for the network.
 *
 * Fitness function is determined by setting weights from individual values and
 * running through the training data comparing outputs.
 * 1st classification is # of entries outside of acceptable.
 */
public class GeneticAlgorithm extends TrainingAlgorithm {
    public static class Params {
        protected int genSize;                 // number of individuals in generation.
        protected int tournamentSize;          // # of individuals selected for a tournament
        protected int winnerCount;             // # of winners from a tournament. Must divide into genSize
        protected double mutationChance;
        protected boolean elitism;
        protected crossoverType crossover;
        protected FitnessInterface fitnessFxn; // chosen fitness function

        // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
        public boolean getElitism() { return this.elitism; }
        public int getGenerationSize() { return this.genSize; }
        public int getTournamentSize() { return this.tournamentSize; }
        public int getWinnerCount() { return this.winnerCount; }
        public double getMutationChance() { return this.mutationChance; }
        public crossoverType getCrossoverType() { return this.crossover; }
        public FitnessInterface getFitness() { return fitnessFxn; }

        public void setElitism(boolean elitism) { this.elitism = elitism; }
        public void setGenerationSize(int size) { this.genSize = size; }
        public void setTournamentSize(int size) { this.tournamentSize = size; }
        public void setWinnerCount(int count) { this.winnerCount = count; }
        public void setMutationChance(double chance) { this.mutationChance = chance; }
        public void setCrossoverType(crossoverType type) { this.crossover = type; }
        public void setFitness(FitnessInterface f) { this.fitnessFxn = f; }
        // </editor-fold>
    }

    protected Params params;

    public static enum crossoverType { ONE_POINT, TWO_POINT }

    private Individual[] pool; // holds individuals in the generation
    private int[] selected;    // holds index of selected children for next gen
    protected Individual g;    // global best
    protected Individual p;    // best of previous generation

    // Constructor
    public GeneticAlgorithm(Params params, int maxIterations, double[] acceptableError)
            throws DataFormatException {
        super(maxIterations, acceptableError);
        this.params = params;
        if (params.tournamentSize < 2) {
            throw new DataFormatException("Tournament size must be >= 2");
        }
        if ((params.genSize % params.winnerCount) != 0) {
            throw new DataFormatException("Winner count must divide into generation size");
        }
    }

    // train network against dataset
    // Each individual defines a list of weights for the network
    // Individuals are tested against all points in the dataset, then evolved
    // to a new set of weights.
    // For fairness, iterations are incremented each time a single dataset pair is tested
    public boolean train(NeuralNetwork n, DataSet d)
            throws IncompatibleDataException, NeuralNetworkException {
        VerifyInputs(n, d);
        iterations = 0;
        boolean converged = false;

        double weightMin = n.getWeightMin(); // min possible value for weight
        double weightMax = n.getWeightMax(); // max possible value for weight
        int weightCount = n.getNumWeights(); // includes weights for bias node

        this.pool = new Individual[params.genSize];
        this.selected = new int[params.genSize];
        int tmp_g = 0; // reduces # of clones by pointing to an index and setting this.g after the loop
        int prev_i = 0;
        Object prev_fit;

        // generate initial population
        for (int i = 0; i < params.genSize; i++) {
            iterations += d.size();
            Individual ind = new Individual(weightCount, weightMin, weightMax);
            // set first individual to existing network weights
            // these will either be random, or the last set weights
            if (i == 0) {
                int cnt = -1;
                for (double weight : n.getWeights()) {
                    ind.setPosition(++cnt, weight);
                }
            }
            ind.fitness = params.fitnessFxn.getFitness(ind.position, n, d, acceptableError);
            this.pool[i] = ind;
            // find first global best
            if (i == 0 || params.fitnessFxn.compare(ind.fitness, pool[tmp_g].fitness)) {
                tmp_g = i;
                // check for convergence
                if (params.fitnessFxn.hasConverged(pool[tmp_g].fitness)) {
                    converged = true;
                    break;
                }
            }
        }
        this.g = pool[tmp_g].clone();
        prev_i = tmp_g;
        prev_fit = this.g.fitness; // setting best of current generation

        // train!
        while (iterations < maxIterations && !converged) {
            iterations += params.genSize * d.size();
            selection();
            for (int i = 0; i < params.genSize; i = i + 2) {
                switch (params.crossover) {
                    case ONE_POINT:
                        crossover_onepoint(selected[i], selected[i + 1], i, i + 1);
                        break;
                    case TWO_POINT:
                        crossover_twopoint(selected[i], selected[i + 1], i, i + 1);
                        break;
                }
            }
            mutation();

            // find best fit for previous generation
            for (int i = 0; i < params.genSize; i++) {
                pool[i].fitness = params.fitnessFxn.getFitness(pool[i].position, n, d, acceptableError);
                // set initial prev_fit
                if (i == 0) {
                    prev_fit = pool[0].fitness;
                    prev_i = 0;
                }
                // check for best fitness of generation
                if (params.fitnessFxn.compare(pool[i].fitness, prev_fit)) {
                    prev_fit = pool[i].fitness;
                    prev_i = i;
                }
            }

            if (params.elitism) {
                // copy global best string to 0th position
                pool[0].copy(this.g);
            }

            // check for new global best
            if (params.fitnessFxn.compare(prev_fit, this.g.fitness)) {
                this.g.copy(pool[prev_i]);
                // check for convergence
                if (params.fitnessFxn.hasConverged(this.g.fitness)) {
                    converged = true;
                    n.setWeights(this.g.position); // assign converged weights to network
                }
            }
            ReportProgress();
        }

        cleanNetwork(n); // Clean up training data
        return converged;
    }

    // chooses individuals from the generation using tournament selection
    // genSize/2 tournaments are run
    // for each tournament, tournamentSize individuals are chosen
    // the best 2 move on to the next generation
    private void selection() {
        Random rnd = NNRandom.getRandom();
        int[] contenders = new int[params.tournamentSize];
        // generate tournaments
        for (int i = 0; i < params.genSize - params.winnerCount; i += params.winnerCount) {
            // choose contenders
            for (int j = 0; j < params.tournamentSize; j++) {
                contenders[j] = rnd.nextInt(params.genSize - 1);
            }
            // fight!
            Arrays.sort(contenders);
            System.arraycopy(contenders, 0, selected, i, params.winnerCount);
        }
    }

    // chooses random crossover point(s)
    private void crossover_onepoint(int parent1, int parent2, int child1, int child2) {
        Random rnd = NNRandom.getRandom();
        int chrom_len = this.pool[0].size;
        int site = rnd.nextInt(chrom_len);
        Individual ind1 = this.pool[child1].clone();
        Individual ind2 = this.pool[child2].clone();
        if (site != 0) {
            for (int i = site + 1; i < chrom_len; i++) {
                ind1.position[i] = this.pool[parent2].position[i];
                ind2.position[i] = this.pool[parent1].position[i];
            }
        }
        this.pool[child1] = ind1;
        this.pool[child2] = ind2;
    }

    // chooses random crossover point(s)
    private void crossover_twopoint(int parent1, int parent2, int child1, int child2) {
        Random rnd = NNRandom.getRandom();
        int chrom_len = this.pool[0].size;
        int site1 = rnd.nextInt(chrom_len);
        int site2 = rnd.nextInt(chrom_len - site1) + site1; // make sure site2 in [site1, chrom_len]
        Individual ind1 = this.pool[child1].clone();
        Individual ind2 = this.pool[child2].clone();
        for (int i = site1; i <= site2; i++) {
            ind1.position[i] = this.pool[parent2].position[i];
            ind2.position[i] = this.pool[parent1].position[i];
        }
        this.pool[child1] = ind1;
        this.pool[child2] = ind2;
    }

    // returns true if bit should be flipped
    private boolean flip() {
        return NNRandom.getRandom().nextDouble() < params.mutationChance;
    }

    private void mutation() {
        for (int i = 0; i < params.genSize; i++) {
            for (int j = 0; j < this.pool[0].size; j++) {
                // converting double into long (so mutation can be done on bits)
                long bits = Double.doubleToLongBits(this.pool[i].position[j]);
                for (int k = 0; k < Long.SIZE; k++) {
                    if (flip()) {
                        bits ^= 1L << k; // 1L: shifting a plain int would wrap past bit 31
                    }
                }
                this.pool[i].position[j] = Double.longBitsToDouble(bits);
            }
        }
    }
    /**
     * Defines individual to be used in Genetic Algorithm.
     */
    private class Individual {
        public Double[] position;
        public Object fitness; // fitness of individual
        protected int size;
        protected double pmin;
        protected double pmax;

        // create new individual and fill with data from p
        Individual(Individual p) {
            reset(p.size);
            copy(p);
        }

        Individual(int size, double pmin, double pmax) {
            reset(size);
            this.pmin = pmin;
            this.pmax = pmax;
            mutation(1, pmin, pmax);
        }

        private void reset(int size) {
            // create initial arrays
            this.size = size;
            position = new Double[size];
            fitness = Integer.MAX_VALUE;
        }

        // debug only
        protected void printPosition() {
            if (DEBUG) {
                for (int i = 0; i < this.size; i++) {
                    System.out.print(position[i] + " ");
                }
                System.out.print("\n");
                System.out.flush();
            }
        }

        @Override
        protected Individual clone() {
            return new Individual(this);
        }

        // copies the individual's elements
        private void copy(Individual p) {
            for (int i = 0; i < this.size; i++) {
                this.position[i] = p.position[i];
            }
            this.fitness = p.fitness;
        }

        // set position array value
        public void setPosition(int index, double value) {
            this.position[index] = value;
        }

        public void mutation(double mutationProbability) {
            mutation(mutationProbability, this.pmin, this.pmax);
        }

        // mutates individual weights using mutation probability
        private void mutation(double mutationProbability, double min, double max) {
            Random r = NNRandom.getRandom();
            double range = max - min;
            for (int i = 0; i < this.size; i++) {
                if (r.nextDouble() < mutationProbability) {
                    this.position[i] = r.nextDouble() * range + min;
                }
            }
        }
    }
}
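The sketch below is not part of the project source; it only illustrates how the GA trainer's parameters fit together. The specific values are invented examples, not recommendations from the project.

// Illustrative only: configuring the GA trainer's Params object.
GeneticAlgorithm.Params gaParams = new GeneticAlgorithm.Params();
gaParams.setGenerationSize(40);    // winnerCount (2) divides evenly into 40
gaParams.setTournamentSize(4);
gaParams.setWinnerCount(2);
gaParams.setMutationChance(0.001); // per-bit flip probability in mutation()
gaParams.setElitism(true);         // keep the global best in slot 0
gaParams.setCrossoverType(GeneticAlgorithm.crossoverType.TWO_POINT);
// gaParams.setFitness(...) takes any FitnessInterface implementation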
/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.NeighborAnnealing.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;

/**
 * Neighbor Annealing to replace backpropagation for training NN.
 *
 * A point in the search space consists of one set of data points that contain
 * all the weights for the network.
 *
 * Neighbor annealing generates a new point no farther than DELTA away
 * from the current point and compares fitness.
 * If the new point has better fitness, moves to the new point and repeats.
 * Each iteration, DELTA is decreased until the endpoint is reached.
 */
public class NeighborAnnealing extends TrainingAlgorithm {
    public static class Params {
        protected double startDelta;
        protected double endDelta;
        protected double stepSize;             // amount to adjust delta by each iteration
        protected FitnessInterface fitnessFxn; // chosen fitness function

        // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
        public double getStartDelta() { return this.startDelta; }
        public double getEndDelta()   { return this.endDelta; }
        public double getStepSize()   { return this.stepSize; }
        public FitnessInterface getFitness() { return fitnessFxn; }

        public void setStartDelta(double startDelta) { this.startDelta = startDelta; }
        public void setEndDelta(double endDelta) { this.endDelta = endDelta; }
        public void setStepSize(double stepSize) { this.stepSize = stepSize; }
        public void setFitness(FitnessInterface f) { this.fitnessFxn = f; }
        // </editor-fold>
    }

    protected Params params;

    // Constructor
    public NeighborAnnealing(Params params, int maxIterations, double[] acceptableError) {
        super(maxIterations, acceptableError);
        this.params = params;
    }

    // train network against dataset
    // For fairness, iterations are incremented each time a single dataset pair is tested
    public boolean train(NeuralNetwork n, DataSet d)
            throws IncompatibleDataException, NeuralNetworkException {
        VerifyInputs(n, d);
        iterations = 0;
        boolean converged = false;
        double delta = params.startDelta;
        int weightCount = n.getNumWeights(); // includes weights for bias node
        Double[] next;                       // holds neighbor position
        Object next_fitness;                 // holds neighbor fitness

        // create initial weights (defaulting to current network weights)
        // which are either random or the last network weights;
        // get initial fitness and check for convergence
        Double[] curr = new Double[weightCount];
        int i = -1;
        for (double weight : n.getWeights()) {
            curr[++i] = weight;
        }
        Object curr_fitness = params.fitnessFxn.getFitness(curr, n, d, acceptableError);
        iterations += d.size(); // account for initial fitness calc
        if (params.fitnessFxn.hasConverged(curr_fitness)) {
            converged = true;
        }

        // calculate weight range for scaling deltas
        double weight_range = n.getWeightMax() - n.getWeightMin();

        // generate and check neighbors
        while (delta > params.endDelta && iterations < maxIterations && !converged) {
            iterations += d.size();
            // scale delta from [0, 1] out to the weight domain [weightMin, weightMax]
            double scaled_delta = delta * weight_range + n.getWeightMin();
            next = neighbor(curr, scaled_delta);
            next_fitness = params.fitnessFxn.getFitness(next, n, d, acceptableError);
            if (params.fitnessFxn.compare(next_fitness, curr_fitness)) {
                curr = next;
                curr_fitness = next_fitness;
                // check for convergence
                if (params.fitnessFxn.hasConverged(curr_fitness)) {
                    converged = true;
                }
            }
            // adjust delta
            delta -= params.stepSize;
            ReportProgress();
        }

        cleanNetwork(n); // Clean up training data
        // return success/failure
        return converged;
    }

    protected static Double[] neighbor(Double[] pos, double delta) {
        Double[] new_pos = new Double[pos.length];
        Random r = NNRandom.getRandom();
        double diff;
        for (int i = 0; i < pos.length; i++) {
            // diff in [-delta, delta]
            diff = r.nextDouble() * 2 * delta - delta;
            new_pos[i] = pos[i] + diff;
        }
        return new_pos;
    }
}
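A brief note on the cooling schedule implied by the loop above; the parameter values here are invented for illustration and are not from the project.

// Illustrative only: with startDelta = 1.0, endDelta = 0.0, and
// stepSize = 0.001, the loop examines at most 1000 neighbors, each
// drawn from a progressively narrower box around the current weights
// (a simple linear cooling schedule).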
/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.ParticleSwarmOptimization.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import java.util.Random;
import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface;

/**
 * Particle Swarm Optimization to replace backpropagation for training NN.
 *
 * A particle consists of one set of data points that contain all the weights
 * for the network.
 *
 * Fitness function is determined by setting weights from particle values and
 * running through the training data comparing outputs.
 * 1st classification is # of entries outside of acceptable.
 * 2nd classification is average difference from output, for ACCEPTABLE results.
 */
public class ParticleSwarmOptimization extends TrainingAlgorithm {
    public static class Params {
        protected int swarmSize;
        protected double c0;                   // randomization constant (velocity)
        protected double c1;                   // randomization constant (personal)
        protected double c2;                   // randomization constant (global)
        protected FitnessInterface fitnessFxn; // chosen fitness function

        // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
        public int getSwarmSize() { return this.swarmSize; }
        public double getC0() { return this.c0; }
        public double getC1() { return this.c1; }
        public double getC2() { return this.c2; }
        public FitnessInterface getFitness() { return fitnessFxn; }

        public void setSwarmSize(int swarmSize) { this.swarmSize = swarmSize; }
        public void setC0(double c0) { this.c0 = c0; }
        public void setC1(double c1) { this.c1 = c1; }
        public void setC2(double c2) { this.c2 = c2; }
        public void setFitness(FitnessInterface f) { this.fitnessFxn = f; }
        // </editor-fold>
    }

    protected Params params;

    private Particle[] swarm; // holds individual particles in the swarm
    protected int g;          // global best

    // Constructor
    public ParticleSwarmOptimization(Params params, int maxIterations, double[] acceptableError) {
        super(maxIterations, acceptableError);
        this.params = params;
    }

    // train network against dataset
    // Each particle defines a list of weights for the network
    // Particles are tested against all points in the dataset, then moved
    // to a new set of weights.
    // For fairness, iterations are incremented each time a single dataset pair is tested
    public boolean train(NeuralNetwork n, DataSet d)
            throws IncompatibleDataException, NeuralNetworkException {
        VerifyInputs(n, d);
        iterations = 0;                      // reset iteration count
        boolean converged = false;
        double weightMin = n.getWeightMin(); // min possible value for weight
        double weightMax = n.getWeightMax(); // max possible value for weight
        int weightCount = n.getNumWeights(); // includes weights for bias node
        Object fitness;

        // Initialize Particles and randomize values
        this.g = 0;
        this.swarm = new Particle[params.swarmSize];
        for (int i = 0; i < params.swarmSize; i++) {
            // each particle scans through the entire dataset.
            // have to increase iterations in the init phase because we increase the
            // chance to find a convergent solution during the init phase by increasing
            // swarmSize. Otherwise, a sufficiently large swarmSize would likely
            // return a convergent value "for free" if we're not counting iterations.
            iterations += d.size();
            Particle p = new Particle(weightCount, weightMin, weightMax);
            // set first particle to existing network weights
            // these will either be random, or the last set weights
            if (i == 0) {
                int cnt = -1;
                for (double weight : n.getWeights()) {
                    p.setPosition(++cnt, weight);
                }
            }
            p.best_fitness = params.fitnessFxn.getFitness(p.position, n, d, acceptableError);
            this.swarm[i] = p;
            // find initial global best
            if (i == 0 || params.fitnessFxn.compare(p.best_fitness, swarm[g].best_fitness)) {
                g = i;
                // check for convergence
                if (params.fitnessFxn.hasConverged(swarm[g].best_fitness)) {
                    converged = true;
                    break;
                }
            }
        }

        // train!
        while (iterations < maxIterations && !converged) {
            for (int i = 0; i < params.swarmSize; i++) {
                // move particle
                update_velocity(swarm[i]);
                update_position(swarm[i]);
                fitness = params.fitnessFxn.getFitness(swarm[i].position, n, d, acceptableError);
                // each particle scans through the entire dataset
                iterations += d.size();
                // compare against particle's pBest, set new pBest as appropriate
                if (params.fitnessFxn.compare(fitness, swarm[i].best_fitness)) {
                    swarm[i].setBest();
                    swarm[i].best_fitness = fitness;
                    // check any new pbests to see if they are gbest also
                    if (params.fitnessFxn.compare(fitness, swarm[g].best_fitness)) {
                        if (DEBUG) {
                            System.out.println("Found new gbest. "
                                    + "Was (" + g + "," + swarm[g].best_fitness + "). "
                                    + "Now (" + i + "," + swarm[i].best_fitness + ").");
                        }
                        g = i;
                        // check for convergence
                        if (params.fitnessFxn.hasConverged(swarm[g].best_fitness)) {
                            converged = true;
                            break; // break out to while loop
                        }
                    }
                }
            }
            ReportProgress();
        }

        cleanNetwork(n); // Clean up training data
        return converged;
    }

    private void update_velocity(Particle p) {
        Random r = NNRandom.getRandom();
        double rand_c0 = params.c0 * r.nextDouble();
        double rand_c1 = params.c1 * r.nextDouble();
        double rand_c2 = params.c2 * r.nextDouble();
        // each dimension of the array needs to be calculated separately
        for (int i = 0; i < p.size; i++) {
            // p.velocity[x] = c0*p.velocity[x] + c1*rnd * (p.best[x] - p.position[x])
            //               + c2*rnd * (g.best[x] - p.position[x])
            double pbest_delta = rand_c1 * (p.best[i] - p.position[i]);
            double gbest_delta = rand_c2 * (swarm[g].best[i] - p.position[i]);
            p.setVelocity(i, rand_c0 * p.velocity[i] + pbest_delta + gbest_delta);
        }
    }

    private void update_position(Particle p) {
        for (int i = 0; i < p.size; i++) {
            p.setPosition(i, p.position[i] + p.velocity[i]);
        }
    }

    /**
     * Defines particle to be used for Particle Swarm Optimization.
     */
    private class Particle {
        public Double[] position;
        public Double[] velocity;
        public Double[] best;
        public Object best_fitness; // fitness of pbest
        protected double pmin;      // minimum weight value
        protected double pmax;      // maximum weight value
        protected double vmax;      // maximum velocity change, derived from pmin/pmax
        protected int size;
        // create new particle and fill with data from p
        Particle(Particle p) throws IncompatibleDataException {
            reset(p.size);
            copy(p);
        }

        Particle(int size, double pmin, double pmax) {
            reset(size);
            this.pmin = pmin;
            this.pmax = pmax;
            this.vmax = this.pmax - this.pmin;
            randomize(position, this.pmin, this.pmax);
            randomize(velocity, -this.vmax, this.vmax);
            setBest();
        }

        private void reset(int size) {
            // create initial arrays
            this.size = size;
            position = new Double[size];
            velocity = new Double[size];
            best = new Double[size];
        }

        // debug only
        protected void printPosition() {
            if (DEBUG) {
                for (int i = 0; i < this.size; i++) {
                    System.out.print(position[i] + " ");
                }
                System.out.print("\n");
                System.out.flush();
            }
        }

        // debug only
        protected void printBest() {
            if (DEBUG) {
                for (int i = 0; i < this.size; i++) {
                    System.out.print(best[i] + " ");
                }
                System.out.print("\n");
                System.out.flush();
            }
        }

        // copies particle elements
        private void copy(Particle p) throws IncompatibleDataException {
            if (p.size != this.size) {
                throw new IncompatibleDataException("Particles are of different size. Unable to copy.");
            }
            for (int i = 0; i < this.size; i++) {
                this.position[i] = p.position[i];
                this.velocity[i] = p.velocity[i];
                this.best[i] = p.best[i];
            }
        }

        // set new best array
        public final void setBest() {
            System.arraycopy(this.position, 0, this.best, 0, this.size);
        }

        // randomize elements of array, bounding each new random entry
        private void randomize(Double[] l, double min, double max) {
            Random r = NNRandom.getRandom();
            double range = (max - min);
            for (int i = 0; i < this.size; i++) {
                // set to random value within appropriate range
                l[i] = r.nextDouble() * range + min;
            }
        }

        // set velocity array value, making sure it falls within bounds
        public void setVelocity(int index, double value) {
            if (value > this.vmax) {
                value = this.vmax;
            } else if (value < -this.vmax) {
                value = -this.vmax;
            }
            this.velocity[index] = value;
        }

        // set position array value, making sure it falls within bounds
        public void setPosition(int index, double value) {
            if (value > this.pmax) {
                value = this.pmax;
            } else if (value < this.pmin) {
                value = this.pmin;
            }
            this.position[index] = value;
        }
    }
}
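To make the velocity update concrete, here is one dimension worked by hand; all numbers are invented for this example and are not from the project.

// Illustrative only: suppose c0 = 1.0, c1 = c2 = 2.0, and the per-update
// random draws all happen to be 0.5, so rand_c0 = 0.5, rand_c1 = rand_c2 = 1.0.
// With velocity = 0.4, position = 1.0, pbest = 2.0, gbest = 3.0:
//   v' = 0.5*0.4 + 1.0*(2.0 - 1.0) + 1.0*(3.0 - 1.0) = 3.2
// which setVelocity() then clamps to [-vmax, vmax].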

/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithm.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import java.util.ArrayList;

import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.event.TrainingProgressEvent;
import edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface;

/**
 * Abstract training algorithm that provides several helper functions for
 * training algorithms.
 */
public abstract class TrainingAlgorithm implements TrainingAlgorithmInterface {

    /**
     * Can be enabled before compilation to print out extra debug statements.
     * (If set to false, debug statements are left out of compilation because
     * the flag is final.)
     */
    public static final boolean DEBUG = false;

    /**
     * Provides functionality to pass a list of parameters specific to a training algorithm.
     * Individual training algorithms extend this class to provide their own parameters.
     * Passing parameters in this way reduces the number of places in the code that must be
     * updated when adding new algorithms.
     */
    public abstract static class Params { };

    /**
     * Correctness criteria
     */
    protected double[] acceptableError;

    /**
     * Number of training iterations to attempt
     */
    protected int maxIterations;

    /**
     * Interval in iterations between TrainingProgressEvents
     */
    protected int reportInterval;

    /**
     * Current number of iterations attempted
     */
    protected int iterations;

    protected ArrayList<TrainingProgressEventListenerInterface> listeners;

    // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
    /**
     * @return correctness criteria
     */
    public double[] getAcceptableError() {
        return this.acceptableError;
    }

    /**
     * @return current training iterations
     */
    public int getIterations() {
        return this.iterations;
    }

    /**
     * @return total allowed training iterations
     */
    public int getMaxIterations() {
        return this.maxIterations;
    }

    /**
     * @return interval (in iterations) between TrainingProgressEvents
     */
    public int getReportInterval() {
        return this.reportInterval;
    }

    /**
     * @param acceptableError
     *            correctness criteria
     */
    public void setAcceptableError(double[] acceptableError) {
        this.acceptableError = acceptableError;
    }

    /**
     * @param maxIterations
     *            total allowed training iterations
     */
    public void setMaxIterations(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    /**
     * @param reportInterval
     *            interval (in iterations) between TrainingProgressEvents
     */
    public void setReportInterval(int reportInterval) {
        this.reportInterval = reportInterval;
    }
    // </editor-fold>

    /**
     * Constructor. Sets variables required for all training algorithms.
     *
     * @param maxIterations
     *            Number of training iterations to attempt before stopping
     * @param acceptableError
     *            Correctness criteria for each dimension of output
     */
    public TrainingAlgorithm(int maxIterations, double[] acceptableError) {
        this.maxIterations = maxIterations;
        this.acceptableError = acceptableError;
        listeners = new ArrayList<TrainingProgressEventListenerInterface>();
        iterations = 0;
        reportInterval = 1000;
    }

    /**
     * Does basic validation of network and dataset to ensure compatibility.
     *
     * @param n
     *            Network to verify
     * @param d
     *            Dataset to verify
     * @throws IncompatibleDataException
     *             If dataset dimensions don't match network dimensions
     * @throws NeuralNetworkException
     *             If dataset is empty
     */
    protected static void VerifyInputs(NeuralNetwork n, DataSet d)
            throws IncompatibleDataException, NeuralNetworkException {
        // Ensure data set has same # of inputs and outputs as the network
        if (n.getNumInputs() != d.getNumInputs()
                || n.getLayer(n.getLayerCount() - 1).getNodeCount() != d.getNumOutputs()) {
            throw new IncompatibleDataException(
                "Dataset has different number of inputs and outputs than the network.\n"
                + "Dataset: " + d.getNumInputs() + ", " + d.getNumOutputs() + "\n"
                + "Network: " + n.getNumInputs() + ", "
                + n.getLayer(n.getLayerCount() - 1).getNodeCount());
        }
        if (d.size() == 0) {
            throw new NeuralNetworkException("Dataset is empty");
        }
    }

    /**
     * Removes training data from nodes of the network
     *
     * @param n
     *            Neural Network to clean
     */
    protected static void cleanNetwork(NeuralNetwork n) {
        for (NetworkLayer l : n) {
            for (NetworkNode nn : l) {
                nn.setTrainingData(null);
            }
        }
    }

    /**
     * Updates training progress listeners with current status
     * at regular intervals.
     */
    protected void ReportProgress() {
        if (iterations % reportInterval == 0) {
            double percentComplete = ((double)iterations/(double)maxIterations) * 100.0;
            TrainingProgressEvent e = new TrainingProgressEvent(this, iterations, percentComplete);
            for (TrainingProgressEventListenerInterface l : listeners) {
                l.trainingProgress(e);
            }
            Thread.yield();
        }
    }

    /**
     * Loads array of weights into the neural network
     *
     * @param weights
     *            Array of weights (must match network node weight count)
     * @param n
     *            Network to set weights on
     */
    protected static void setWeights(Double[] weights, NeuralNetwork n) {
        n.setWeights(weights);
    }

    // <editor-fold defaultstate="collapsed" desc="Listener Methods">
    public void addTrainingEventListener(TrainingProgressEventListenerInterface listener) {
        listeners.add(listener);
    }

    public void removeTrainingEventListener(TrainingProgressEventListenerInterface listener) {
        listeners.remove(listener);
    }

    public void clearListeners() {
        listeners.clear();
    }
    // </editor-fold>
}

/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer;

import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.event.TrainingProgressEventListenerInterface;

/**
 * Interface for all training algorithms.
 */
public interface TrainingAlgorithmInterface {
    /**
     * Trains the network.
     *
     * @param n
     *            Neural network to train.
     * @param d
     *            Training set
     * @return True if network converged, false otherwise.
     * @throws NeuralNetworkException
     */
    public boolean train(NeuralNetwork n, DataSet d) throws NeuralNetworkException;

    /**
     * @return Number of iterations taken.
     */
    public int getIterations();

    /**
     * @return Maximum number of iterations allowed.
     */
    public int getMaxIterations();

    /**
     * @return Frequency (in iterations) to send training events.
     */
    public int getReportInterval();

    /**
     * @param maxIterations
     *            Maximum number of iterations allowed.
     */
    public void setMaxIterations(int maxIterations);

    /**
     * @param reportInterval
     *            Frequency (in iterations) to send training events.
     */
    public void setReportInterval(int reportInterval);

    public void addTrainingEventListener(TrainingProgressEventListenerInterface listener);

    public void removeTrainingEventListener(TrainingProgressEventListenerInterface listener);

    public void clearListeners();
}
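
/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): a minimal
 * skeleton for a new training algorithm, using only the helpers that
 * TrainingAlgorithm exposes above (VerifyInputs, ReportProgress,
 * cleanNetwork, iterations, maxIterations). The class name is
 * hypothetical; a real algorithm would adjust the network weights
 * inside the loop rather than merely counting passes.
 * ------------------------------------------------------------------- */
package edu.csus.ecs.ssnn.nn.trainer;

import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;

public class SkeletonTrainerSketch extends TrainingAlgorithm {

    public SkeletonTrainerSketch(int maxIterations, double[] acceptableError) {
        super(maxIterations, acceptableError);
    }

    @Override
    public boolean train(NeuralNetwork n, DataSet d) throws NeuralNetworkException {
        // sanity-check dimensions before training (IncompatibleDataException is
        // assumed to extend NeuralNetworkException, matching the interface's throws clause)
        VerifyInputs(n, d);
        boolean converged = false;
        while (iterations < maxIterations && !converged) {
            // a real algorithm would evaluate fitness and adjust weights here
            iterations += d.size(); // one pass over the training set
            ReportProgress();       // notify listeners every reportInterval
        }
        cleanNetwork(n); // remove per-node training data
        return converged;
    }
}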

/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.fitness.FitnessInterface.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;

import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;

/**
 * Interface to provide fitness function functions :).
 */
public interface FitnessInterface {
    /**
     * Calculates and returns fitness of network against training set.
     * Returns as an object to allow for different fitness functions to return
     * different data types.
     *
     * @param weights
     *            Weights to set on neural network.
     * @param n
     *            Network to use for calculating fitness.
     * @param trainingData
     *            Training set to calculate fitness against.
     * @param acceptableError
     *            Acceptable error for outputs to differ from expected.
     * @return Object containing fitness
     * @throws NeuralNetworkException
     */
    public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
            double[] acceptableError) throws NeuralNetworkException;

    /**
     * Checks passed fitness value for convergence.
     *
     * @param fitness
     *            Calculated fitness value
     * @return True if the fitness value meets the convergence criteria.
     */
    public boolean hasConverged(Object fitness);

    /**
     * Compares 2 fitness values to determine if new is better than original
     *
     * @param orig_fitness
     *            Original calculated fitness value
     * @param new_fitness
     *            New calculated fitness value
     * @return True if the new fitness is better than the original.
     */
    public boolean compare(Object new_fitness, Object orig_fitness);
}

/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.fitness.LargestChunk.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;

import java.util.List;

import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
import edu.csus.ecs.ssnn.nn.DataPair;

/**
 * Fitness is determined by largest chunk size of contiguously solved points.
 * Larger fitness is better.
 */
public class LargestChunk implements FitnessInterface {
    int totalSize;

    /**
     * Constructor.
     */
    public LargestChunk() {
        this.totalSize = 0;
    }

    /**
     * Fitness is determined by largest chunk size of contiguously solved points.
     *
     * @param weights
     *            Weights to set on neural network.
     * @param n
     *            Network to use for calculating fitness.
     * @param trainingData
     *            Training set to calculate fitness against.
     * @param acceptableError
     *            Acceptable error for outputs to differ from expected.
     * @return Object containing fitness
     * @throws NeuralNetworkException
     */
    public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
            double[] acceptableError) throws NeuralNetworkException {
        boolean correct;
        int maxChunkSize = 0;
        int curChunkSize = 0;

        n.setWeights(weights);
        this.totalSize = trainingData.size();
        double[] errors = new double[n.getNumOutputs()];

        for(DataPair dp: trainingData) {
            // run training data
            List<Double> inputs = dp.getInputs();
            List<Double> outs = n.getOutputs(inputs);

            // Compute output errors on each dimension
            for (int i = 0; i < n.getNumOutputs(); i++) {
                // Error = desired output - actual output
                errors[i] = dp.getOutput(i) - outs.get(i);
            }

            // See if all outputs are within acceptable error
            correct = true;
            for (int i = 0; i < errors.length; i++) {
                if (Math.abs(errors[i]) > acceptableError[i]) {
                    correct = false;
                    break;
                }
            }

            // set chunks
            if(correct) {
                curChunkSize++;
            } else {
                if( curChunkSize > maxChunkSize ) {
                    maxChunkSize = curChunkSize;
                }
                curChunkSize = 0; // a wrong point always ends the current chunk
            }
        }

        // final check in case largest chunk is at the end
        if( curChunkSize > maxChunkSize ) {
            maxChunkSize = curChunkSize;
        }

        return new Integer(maxChunkSize);
    }

    public boolean hasConverged(Object fitness) {
        // if chunk covers entire data set
        return((Integer) fitness == this.totalSize);
    }

    public boolean compare(Object new_fitness, Object orig_fitness) {
        // new chunk is larger than old chunk
        return ( (Integer) new_fitness > (Integer) orig_fitness);
    }
}
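
/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): the chunk scan
 * in LargestChunk.getFitness() above, reduced to a boolean "solved"
 * flag per point. For the pattern below the largest contiguous run of
 * solved points has length 3. The class name and data are hypothetical.
 * ------------------------------------------------------------------- */
class ChunkScanSketch {
    public static void main(String[] args) {
        boolean[] solved = { true, true, false, true, true, true, false, true };
        int max = 0, cur = 0;
        for (boolean ok : solved) {
            if (ok) {
                cur++; // extend the current chunk
            } else {
                if (cur > max) { max = cur; }
                cur = 0; // a wrong point always ends the chunk
            }
        }
        if (cur > max) { max = cur; } // final check: chunk may end the scan
        System.out.println("largest chunk: " + max); // prints 3
    }
}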

/* ===================================================================
   edu.csus.ecs.ssnn.nn.trainer.fitness.SimpleUnsolvedPoints.java
   =================================================================== */
package edu.csus.ecs.ssnn.nn.trainer.fitness;

import java.util.List;

import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;
import edu.csus.ecs.ssnn.nn.DataPair;

/**
 * Calculates fitness by number of unsolved points. Lower fitness is better.
 */
public class SimpleUnsolvedPoints implements FitnessInterface {
    /**
     * Calculates fitness by number of unsolved points.
     *
     * @param weights
     *            Weights to set on neural network.
     * @param n
     *            Network to use for calculating fitness.
     * @param trainingData
     *            Training set to calculate fitness against.
     * @param acceptableError
     *            Acceptable error for outputs to differ from expected.
     * @return Object containing fitness
     * @throws NeuralNetworkException
     */
    public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
            double[] acceptableError) throws NeuralNetworkException {
        int incorrectOutput = 0;
        n.setWeights(weights);

        for(DataPair dp: trainingData) {
            // run training data
            List<Double> inputs = dp.getInputs();
            List<Double> outs = n.getOutputs(inputs);

            // Compute output errors on each dimension
            double[] errors = new double[n.getNumOutputs()];
            for (int i = 0; i < n.getNumOutputs(); i++) {
                // Error = desired output - actual output
                errors[i] = dp.getOutput(i) - outs.get(i);
            }

            // See if all outputs are within acceptable error
            for (int i = 0; i < errors.length; i++) {
                if (Math.abs(errors[i]) > acceptableError[i]) {
                    incorrectOutput++; // using # of miscalculations
                    break;
                }
            }
        }
        return new Integer(incorrectOutput);
    }

    public boolean hasConverged(Object fitness) {
        return((Integer) fitness == 0);
    }

    public boolean compare(Object new_fitness, Object orig_fitness) {
        return ( (Integer) new_fitness < (Integer) orig_fitness);
    }
}
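
/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): a minimal sketch
 * of a third fitness function, total squared error, following the same
 * FitnessInterface pattern as the two implementations above. The class
 * name and the convergence threshold are hypothetical.
 * ------------------------------------------------------------------- */
package edu.csus.ecs.ssnn.nn.trainer.fitness;

import java.util.List;

import edu.csus.ecs.ssnn.nn.DataPair;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;

public class SquaredErrorSketch implements FitnessInterface {
    public Object getFitness(Double[] weights, NeuralNetwork n, DataSet trainingData,
            double[] acceptableError) throws NeuralNetworkException {
        double sse = 0.0;
        n.setWeights(weights);
        for (DataPair dp : trainingData) {
            List<Double> outs = n.getOutputs(dp.getInputs());
            for (int i = 0; i < n.getNumOutputs(); i++) {
                double err = dp.getOutput(i) - outs.get(i); // desired - actual
                sse += err * err;
            }
        }
        return new Double(sse); // lower is better
    }

    public boolean hasConverged(Object fitness) {
        // note: unlike the implementations above, this ignores the per-point
        // acceptable error and uses an arbitrary aggregate threshold
        return (Double) fitness < 1e-4;
    }

    public boolean compare(Object new_fitness, Object orig_fitness) {
        return (Double) new_fitness < (Double) orig_fitness;
    }
}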

/* ===================================================================
   edu.csus.ecs.ssnn.splittrainer.AreaBasedBinarySplitTrainer.java
   =================================================================== */
package edu.csus.ecs.ssnn.splittrainer;

import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;

public class AreaBasedBinarySplitTrainer extends SSNNTrainingAlgorithm {

    /*
     * Used to pass region / randomize values when queuing networks
     */
    protected class QueueDataRegion {
        public DataRegion region;
        public boolean saveWeights;

        public QueueDataRegion(DataRegion region, boolean saveWeights) {
            this.region = region;
            this.saveWeights = saveWeights;
        }
    }

    public AreaBasedBinarySplitTrainer() {
        super();
    }

    /* Area Based Binary Splitting algorithm
     * Find largest solved chunk and split network as follows:
     * [smaller unsolved edge + solved] [larger unsolved]
     * Currently, smaller is defined as "less area"
     * rather than less points contained within the region.
     * If a sufficiently large chunk cannot be found, does a centroid split.
     *
     * The network that includes the solved chunk must initialize with the last weights.
     */
    @Override
    protected SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException {
        Chunk best = findBestChunk(n, d);

        // Now determine what kind of split to do...
        if (best.length >= minChunkSize) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(best.dimension);
            double domainMin = sortedData.getPair(0).getInput(best.dimension);
            double domainMax = sortedData.getPair(sortedData.size() - 1).getInput(best.dimension);

            // Somehow our chunk covers the entire domain
            /*
             * It's actually possible for this to happen, although it is
             * extremely rare. If the training algorithm does not converge, but
             * its very last adjustment to the network made it perfect, this
             * case will occur.
             */
            if (best.start == domainMin && best.end == domainMax) {
                return SplitType.unnecessary;
            }
            // Are we on the lower or upper edge?
            else if (best.start == domainMin || best.end == domainMax) {
                // Make sure there are at least two distinct values in the chunk
                if (countDistinctValuesInChunk(best.start, best.end, sortedData, best.dimension) < 2) {
                    // Can't split (too few values) - try centroid
                    return centroidSplit(n, r, d);
                } else {
                    // split based on min/max
                    List<DataRegion> splitRegions;
                    ArrayList<Double> splitPoints = new ArrayList<Double>();
                    ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
                    ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();

                    // lower half is solved
                    if(best.start == domainMin) {
                        splitRegions = r.split(best.dimension, best.end);
                        unsolvedRegions.add(splitRegions.get(1));
                        solvedRegions.add(splitRegions.get(0));
                        splitPoints.add(best.end);
                    // upper half is solved
                    } else {
                        splitRegions = r.split(best.dimension, best.start);
                        unsolvedRegions.add(splitRegions.get(0));
                        solvedRegions.add(splitRegions.get(1));
                        splitPoints.add(best.start);
                    }

                    // Remove existing network
                    trainingMNetwork.removeNetwork(n);
                    // Re-add solved region
                    queueSolvedNetwork(n, solvedRegions);
                    // create new network for the unsolved region
                    queueUnsolvedNetworks(n, unsolvedRegions);

                    // Create and dispatch split event
                    HandleSplit(SplitType.chunk, unsolvedRegions, solvedRegions,
                            splitPoints, best.length, best.dimension);
                    return SplitType.chunk;
                }
            }
            // We're in the middle of the data set
            else {
                // Need at least three distinct values in the chunk
                if (countDistinctValuesInChunk(best.start, best.end, sortedData, best.dimension) < 3) {
                    // Can't split - try centroid
                    return centroidSplit(n, r, d);
                } else {
                    // split data into [smaller unsolved + solved] [unsolved]
                    // first find [unsolved] [solved] [unsolved]
                    // then determine which [unsolved] is smaller
                    List<DataRegion> firstSplit = r.split(best.dimension, best.start);
                    List<DataRegion> secondSplit = firstSplit.get(1).split(best.dimension, best.end);

                    ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
                    ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
                    List<DataRegion> finalSplit;

                    // index 0 is always the randomized region; index 1 is the region
                    // containing the solved chunk, which keeps the last weights
                    if(firstSplit.get(0).getArea() < secondSplit.get(1).getArea()) {
                        // lower unsolved is smaller, so include with solved network
                        best.start = r.getMin(best.dimension);
                        finalSplit = r.split(best.dimension, best.end);
                        unsolvedRegions.add(finalSplit.get(1));
                        unsolvedRegions.add(finalSplit.get(0)); // save weights
                    } else {
                        // upper unsolved is smaller, so include with solved network
                        best.end = r.getMax(best.dimension);
                        finalSplit = r.split(best.dimension, best.start);
                        unsolvedRegions.add(finalSplit.get(0));
                        unsolvedRegions.add(finalSplit.get(1)); // save weights
                    }

                    ArrayList<Double> splitPoints = new ArrayList<Double>();
                    splitPoints.add(best.start);
                    splitPoints.add(best.end);

                    // Remove existing network
                    trainingMNetwork.removeNetwork(n);

                    // create new networks for the unsolved upper / lower regions
                    List<QueueDataRegion> queueUnsolved = new ArrayList<QueueDataRegion>();
                    queueUnsolved.add(new QueueDataRegion(unsolvedRegions.get(0), false));
                    queueUnsolved.add(new QueueDataRegion(unsolvedRegions.get(1), true));
                    queueUnsolvedNetworks_SaveWeights(n, queueUnsolved);

                    // Create and dispatch split event
                    HandleSplit(SplitType.chunk, unsolvedRegions, solvedRegions,
                            splitPoints, best.length, best.dimension);
                    return SplitType.chunk;
                }
            }
        // No chunk or chunk too small
        } else {
            sizeFailures++;
            return centroidSplit(n, r, d);
        }
    }

    /*
     * Splits Dataset across 2 neural networks with no solved regions.
     */
    private SplitType centroidSplit(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException {
        // To select the dimension to split along, find the one with the most
        // distinct values
        int bestDimension = 0;
        int mostDistinctValues = 0;
        for (int inputNum = 0; inputNum < n.getNumInputs(); inputNum++) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(inputNum);
            int currDistinctValues = countDistinctValuesInChunk(
                    sortedData.getPair(0).getInput(inputNum),
                    sortedData.getPair(sortedData.size() - 1).getInput(inputNum),
                    sortedData, inputNum);
            if (currDistinctValues > mostDistinctValues) {
                mostDistinctValues = currDistinctValues;
                bestDimension = inputNum;
            }
        }

        if (mostDistinctValues > 2) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(bestDimension);
            int splitPoint = mostDistinctValues / 2;

            // creating solved regions list (empty), unsolved regions, and split points list
            ArrayList<Double> splitPoints = new ArrayList<Double>();
            splitPoints.add(sortedData.getPair(splitPoint).getInput(bestDimension));
            List<DataRegion> unsolvedRegions = r.split(bestDimension,
                    sortedData.getPair(splitPoint).getInput(bestDimension));
            List<DataRegion> solvedRegions = new ArrayList<DataRegion>();

            // removing original network - replaced by new split networks
            trainingMNetwork.removeNetwork(n);
            // create new networks for the unsolved upper / lower regions
            queueUnsolvedNetworks(n, unsolvedRegions);

            HandleSplit(SplitType.centroid, unsolvedRegions, solvedRegions,
                    splitPoints, 0, bestDimension);
            return SplitType.centroid;
        }

        // No way to split
        return SplitType.impossible;
    }

    /*
     * Providing new queuing method to allow for non-randomized weights.
     * Rather than pass a list of regions, passing a list of <region, saveWeights> pairs.
     */
    protected void queueUnsolvedNetworks_SaveWeights(NeuralNetwork nnTopo,
            List<QueueDataRegion> regions) throws NeuralNetworkException {
        for(QueueDataRegion q: regions) {
            DataRegion scalingRegion;
            NeuralNetwork n = new NeuralNetwork(nnTopo.getNumInputs(),
                    nnTopo.getNumOutputs(), nnTopo.getHiddenTopology());
            if(!q.saveWeights) {
                n.randomizeWeights(0.1);
                scalingRegion = q.region;
            } else {
                // clones weights for new network - this is important if future algorithms
                // can use the same nnTopo to create multiple networks, as cloning ensures
                // they aren't sharing the same underlying Double objects
                n.setWeights(nnTopo.getWeights().toArray(new Double[]{}));
                scalingRegion = nnTopo.getScalingRegion();
            }
            n.setInputScaling(scalingRegion, trainingMNetwork.getInputRange());
            trainingMNetwork.addNetwork(n, q.region);
            networkQueue.add(trainingMNetwork.getNetworkRecord(trainingMNetwork.getNetworkCount() - 1));
        }
    }

    /*
     * Count all distinct values in the dimension that fall between min / max
     */
    private int countDistinctValuesInChunk(double minValue, double maxValue,
            DataSet sortedData, int chunkDimension) {
        int count = 0;
        double lastValue = Double.MIN_VALUE;
        for (DataPair p : sortedData) {
            double curValue = p.getInput(chunkDimension);
            if (curValue > maxValue) {
                break;
            }
            if (curValue >= minValue) {
                if (curValue != lastValue) {
                    lastValue = curValue;
                    count++;
                }
            }
        }
        return count;
    }

    /*
     * Finds largest solved chunk across all domains
     */
    private Chunk findBestChunk(NeuralNetwork n, DataSet d) throws NeuralNetworkException {
        Chunk current;
        Chunk best = new Chunk();

        // determine scaling slopes for original network
        // this will match sorted dataset, as domain min/maxes aren't changed
        //List<Double> inputSlopes = d.getInputScalingSlope(trainingMNetwork.getScalingInputMin(), trainingMNetwork.getScalingInputMax());
        //List<Double> outputSlopes = d.getOutputScalingSlope(trainingMNetwork.getScalingOutputMin(), trainingMNetwork.getScalingOutputMax());

        // For each input dimension, find largest solved chunk
        for (int inputNum = 0; inputNum < d.getNumInputs(); inputNum++) {
            // Create a dataset which is sorted on the input dimension
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(inputNum);
            current = new Chunk(); // reset chunk for new dimension
            double lastFail = 0;

            for (int pairNum = 0; pairNum < sortedData.size(); pairNum++) {
                DataPair currentPair = sortedData.getPair(pairNum);

                // fast-forward through points with same dimensional value if original failed
                // (don't want to split on a dimensional point with a mix of good/bad values)
                if(pairNum > 0 && current.length == 0
                        && currentPair.getInput(inputNum) == lastFail) {
                    continue;
                }

                List<Double> outputs = n.getOutputs(currentPair.getInputs());

                // Check each output for correctness against expected
                boolean failed = false;
                for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
                    if (Math.abs(outputs.get(outputNum)
                            - currentPair.getOutput(outputNum)) > acceptableErrors[outputNum]) {
                        failed = true;
                        break;
                    }
                }

                // output is incorrect, reset chunk
                if (failed) {
                    lastFail = currentPair.getInput(inputNum);
                    // found a new best!
                    if(current.length > best.length) {
                        best = current;
                    }
                    current = new Chunk(); // reset chunk
                }
                // output is correct, increment chunk
                else {
                    current.length++;
                    current.end = currentPair.getInput(inputNum);
                    if(current.length == 1) {
                        // this is a new chunk
                        current.start = current.end;
                        current.dimension = inputNum;
                    }
                }
            }

            // extra check to make sure last chunk wasn't best
            if(current.length > best.length) {
                best = current;
            }
        }

        // best should now hold best chunk values
        return best;
    }
}

/* ===================================================================
   edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithm.java
   =================================================================== */
package edu.csus.ecs.ssnn.splittrainer;

import edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface;
import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.EnumMap;

import edu.csus.ecs.ssnn.data.TrainingResults;
import edu.csus.ecs.ssnn.event.NetworkConvergedEvent;
import edu.csus.ecs.ssnn.event.NetworkSplitEvent;
import edu.csus.ecs.ssnn.event.NetworkTrainingEvent;
import edu.csus.ecs.ssnn.event.TrainingCompletedEvent;
import edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface;
import edu.csus.ecs.ssnn.nn.ModularNeuralNetwork.NetworkRecord;

/**
 * Base class to use for SSNN training algorithms. It provides several helper
 * functions to reduce the amount of effort to create a new algorithm.
 */
public abstract class SSNNTrainingAlgorithm implements SSNNTrainingAlgorithmInterface {

    /**
     * Type of network split that occurred.
     */
    protected static enum SplitType {
        chunk, centroid, impossible, unnecessary
    }

    /**
     * A chunk is a set of contiguous solved points. This class holds
     * information on where the chunk falls in the dataset.
     */
    protected class Chunk {
        double start;
        double end;
        int dimension;
        int length;

        Chunk() {
            this.start = Double.MIN_VALUE;
            this.end = Double.MIN_VALUE;
            this.dimension = -1;
            this.length = 0;
        }
    }

    /**
     * Contains a list of counts for each split type.
     */
    protected Map<SplitType, Integer> splitCounts;

    /**
     * Number of splits that failed because resulting training set was too small.
     */
    protected int sizeFailures;

    /**
     * Start time of training. Used for calculating how long training
     * networks takes.
     */
    protected long startTime;

    /**
     * Total area to be solved by the modular neural network.
     */
    protected double totalArea;

    /**
     * Current number of iterations taken to solve the network.
     */
    protected int totalIterations;

    /**
     * The modular neural network containing solved networks and associated regions.
     */
    protected ModularNeuralNetwork trainingMNetwork;

    /**
     * List of networks that still need to be trained.
     */
    protected LinkedList<NetworkRecord> networkQueue;

    /**
     * Event listeners for splitting.
     */
    protected ArrayList<SplitTrainerEventListenerInterface> listeners;

    // <editor-fold defaultstate="collapsed" desc="Settable properties for training.">
    /**
     * Algorithm used to train individual neural networks.
     */
    protected TrainingAlgorithmInterface nn_alg;

    /**
     * Correctness criteria for determining if output was within acceptable error.
     */
    protected double[] acceptableErrors;

    /**
     * Flag used to stop training if a network is unsolvable and cannot be split.
     */
    protected boolean failOnUnsplittableNetwork;

    /**
     * Minimum size of training data allowed for any individual network.
     */
    protected int minChunkSize;

    /**
     * Training set used to train the neural networks.
     */
    protected DataSet trainingData;
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Get/Set Properties">
    /**
     * @return Correctness criteria for determining if output was within acceptable error.
     */
    public double[] getAcceptableErrors() {
        return this.acceptableErrors;
    }

    /**
     * @return Minimum size of training data allowed for any individual network.
     */
    public int getMinChunkSize() {
        return minChunkSize;
    }

    /**
     * @return Flag used to stop training if a network is unsolvable and cannot be split.
     */
    public boolean isFailOnUnsplittableNetwork() {
        return failOnUnsplittableNetwork;
    }

    /**
     * @param acceptableErrors
     *            Correctness criteria for determining if output was within acceptable error.
     */
    public void setAcceptableErrors(double[] acceptableErrors) {
        this.acceptableErrors = acceptableErrors;
    }

    /**
     * @param minSize
     *            Minimum size of training data allowed for any individual network.
     * @throws DataFormatException
     *             Thrown if minimum splitting size is less than one.
     */
    public void setMinChunkSize(int minSize) throws DataFormatException {
        if(minSize < 1) {
            throw new DataFormatException("Minimum splitting size must be > 0 (was " + minSize + ")");
        }
        minChunkSize = minSize;
    }

    /**
     * @param fail
     *            Flag used to stop training if a network is unsolvable and cannot be split.
     */
    public void setFailOnUnsplittableNetwork(boolean fail) {
        failOnUnsplittableNetwork = fail;
    }
    // </editor-fold>

    /**
     * Constructor.
     */
    public SSNNTrainingAlgorithm() {
        minChunkSize = 1;
        failOnUnsplittableNetwork = true;
        splitCounts = new EnumMap<SplitType, Integer>(SplitType.class);
        listeners = new ArrayList<SplitTrainerEventListenerInterface>();
    }

    /**
     * Defines split algorithm to be used. Must be defined by algorithm class.
     *
     * @param n
     *            Network to split.
     * @param r
     *            Data region of network.
     * @param d
     *            Data set of region.
     * @return Type of split determined by network.
     * @throws NeuralNetworkException
     */
    protected abstract SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException;

    /**
     * Basic training framework for splitting and training subnetworks.
     * Individual splitting algorithms need to define splitNetwork().
     *
     * @param mn
     *            Modular neural network to be trained.
     * @param trainingDataSet
     *            Complete set of training data used to train the MNN.
     * @param a
     *            Training algorithm used to train individual neural networks.
     * @throws NeuralNetworkException
     */
    public boolean train(ModularNeuralNetwork mn, DataSet trainingDataSet,
            TrainingAlgorithmInterface a) throws NeuralNetworkException {
        boolean nnConverged;

        // reset all variables before training
        this.trainingData = trainingDataSet;
        this.nn_alg = a;
        this.trainingMNetwork = mn;
        this.totalIterations = 0;
        this.startTime = System.currentTimeMillis();
        this.totalArea = trainingData.getInputDataRegion().getArea();
        this.networkQueue = new LinkedList<NetworkRecord>();

        // set all split counts to zero
        for(SplitType i : SplitType.values()) {
            splitCounts.put(i, 0);
        }

        // Put all existing neural networks into a queue
        Iterator<NetworkRecord> iter = trainingMNetwork.iterator();
        while (iter.hasNext()) {
            networkQueue.add(iter.next());
        }

        // While there are networks to train
        while (networkQueue.size() > 0) {
            // Get the network from the end of the queue and its relevant
            // region and training data set. Grabbing last network so that
            // previous weights can be applied (if necessary)
            NetworkRecord currentNetworkRecord = networkQueue.pop(); //.poll();
            NeuralNetwork currentNetwork = currentNetworkRecord.getNeuralNetwork();
            DataRegion currentRegion = currentNetworkRecord.getDataRegion();
            DataSet regionSpecificDataSet = trainingData.getDataInRegion(currentRegion);
            currentNetworkRecord.setTrainingSetSize(regionSpecificDataSet.size());

            // Try training the network
            StartTraining(currentRegion);
            // training with data set
            nnConverged = TrainNetwork(currentNetworkRecord, regionSpecificDataSet);

            if (!nnConverged) {
                // If it doesn't converge, try splitting it.
                SplitType splitResult = splitNetwork(currentNetwork, currentRegion,
                        regionSpecificDataSet);
                if (splitResult == SplitType.centroid) {
                    // nothing further to do; the split queued the new networks
                } else if (splitResult == SplitType.chunk) {
                    // nothing further to do; the split queued the new networks
                } else if (splitResult == SplitType.unnecessary) {
                    currentNetworkRecord.setSolved(true);
                    HandleNetworkConverged(currentRegion, nn_alg.getIterations());
                } else if (failOnUnsplittableNetwork) {
                    removeUntestedNetworks();
                    HandleTrainingCompleted(false);
                    return false;
                }
            } else {
                // done with network, converged successfully
                HandleNetworkConverged(currentRegion, nn_alg.getIterations());
            }
            Thread.yield();
        }
        HandleTrainingCompleted(true);
        return true;
    }

    /**
     * Walks through queued networks to determine area of data regions
     * left to solve, then computes against totalArea.
     *
     * @return percentage of total area solved by the networks.
     */
    protected final double computeSolvedPercentage() {
        double unsolvedArea = 0;
        for (NetworkRecord r : networkQueue) {
            unsolvedArea += r.getDataRegion().getArea();
        }
        return (totalArea - unsolvedArea) * 100.0 / totalArea;
    }

    /**
     * Removes all networks from the MNN that were never tested.
     * (This occurs when totalIterations is reached before the problem is solved.)
     * Tested by checking whether solvedDataPoints is set.
     *
     * @throws NeuralNetworkException
     *             Thrown if unable to add network to the list of untested.
     */
    protected void removeUntestedNetworks() throws NeuralNetworkException {
        ModularNeuralNetwork untested = new ModularNeuralNetwork(
                trainingMNetwork.getNumInputs(), trainingMNetwork.getNumOutputs(),
                trainingMNetwork.getInputRange());
        for(NetworkRecord nr: trainingMNetwork) {
            if(nr.getSolvedDataPoints() == null) {
                untested.addNetwork(nr.getNeuralNetwork(), nr.getDataRegion());
            }
        }
        for(NetworkRecord nr: untested) {
            trainingMNetwork.removeNetwork(nr.getNeuralNetwork());
        }
    }

    /**
     * @param n
     *            Solved neural network to add to MNN.
     * @param regions
     *            List of regions the solved network can solve.
     * @throws NeuralNetworkException
     */
    protected void queueSolvedNetwork(NeuralNetwork n, List<DataRegion> regions)
            throws NeuralNetworkException {
        for(DataRegion r: regions) {
            trainingMNetwork.addNetwork(n, r, true);
            NetworkRecord nr = trainingMNetwork.getNetworkRecord(
                    trainingMNetwork.getNetworkCount() - 1);
            nr.setTrainingSetSize(trainingData.getDataInRegion(r).size());
        }
    }

    /**
     * @param parentNN
     *            Used to gather neural network topology (inputs, outputs, hidden layers)
     *            to create new unsolved networks.
     * @param regions
     *            List of new unsolved regions.
     * @throws NeuralNetworkException
     *             Thrown on errors setting network input scaling or adding the network to the queue.
     */
    protected void queueUnsolvedNetworks(NeuralNetwork parentNN, List<DataRegion> regions)
            throws NeuralNetworkException {
        for(DataRegion r: regions) {
            NeuralNetwork n = new NeuralNetwork(parentNN.getNumInputs(),
                    parentNN.getNumOutputs(), parentNN.getHiddenTopology());
            n.setInputScaling(r, trainingMNetwork.getInputRange());
            n.randomizeWeights(0.1); // used by Back Propagation only
            trainingMNetwork.addNetwork(n, r);
            networkQueue.add(trainingMNetwork.getNetworkRecord(
                    trainingMNetwork.getNetworkCount() - 1));
        }
    }

    /**
     * Runs through the training set and determines which points are solvable by
     * the individual network. At a minimum, all points within the initial
     * solved region should be correctly solved.
     *
     * @param nr
     *            Network record containing network and data region associated with it.
     * @throws NeuralNetworkException
     *             Thrown if error creating data sets, calculating network outputs,
     *             or if a point in the network's region (i.e. previously solved) is
     *             now calculated as unsolved.
     */
    protected void setSolvedDataPoints(NetworkRecord nr) throws NeuralNetworkException {
        DataSet solved = new DataSet(trainingData.getNumInputs(), trainingData.getNumOutputs());
        DataSet unsolved = new DataSet(trainingData.getNumInputs(), trainingData.getNumOutputs());

        for(DataPair pair: trainingData) {
            List<Double> outputs = nr.getNeuralNetwork().getOutputs(pair.getInputs());
            // Check each output for correctness; a pair is solved only if
            // every output dimension is within the acceptable error
            boolean correct = true;
            for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
                if (Math.abs(outputs.get(outputNum) - pair.getOutput(outputNum))
                        > acceptableErrors[outputNum]) {
                    // DEBUG check - yell if unsolved within region
                    if(nr.getDataRegion().containsPoint(pair.getInputs())) {
                        throw new NeuralNetworkException(
                            "Network determined as correct, but point in region remains unsolved.");
                    }
                    unsolved.addPair(pair);
                    correct = false;
                    break;
                }
            }
            if (correct) {
                solved.addPair(pair);
            }
        }

        if(solved.size() == 0 && nr.isSolved()) {
            throw new NeuralNetworkException(
                "Region determined as solved, but no points in solved set.");
        }
        nr.setSolvedDataPoints(solved);
        nr.setUnsolvedDataPoints(unsolved);
    }

    /**
     * Train an individual neural network.
     *
     * @param currentNetworkRecord
     *            Network record that includes the network to be trained
     * @param regionTrainingSet
     *            Data points from the training set that fall in the region
     * @return True if network converges, False otherwise.
     * @throws NeuralNetworkException
     *             Thrown if error occurs in training network
     */
    protected boolean TrainNetwork(NetworkRecord currentNetworkRecord, DataSet regionTrainingSet)
            throws NeuralNetworkException {
        NeuralNetwork currentNetwork = currentNetworkRecord.getNeuralNetwork();
        boolean converged = nn_alg.train(currentNetwork, regionTrainingSet);
        currentNetworkRecord.setSolved(converged);
        totalIterations += nn_alg.getIterations();
        return converged;
    }

    // <editor-fold defaultstate="collapsed" desc="Event Dispatchers">
    /**
     * Called before starting to train a new neural network.
     *
     * @param region
     *            Region to be solved by the network
     */
    protected final void StartTraining(DataRegion region) {
        NetworkTrainingEvent e = new NetworkTrainingEvent(this, region);
        dispatchNetworkTrainingEvent(e);
    }

    /**
     * Called by the algorithm after a split occurs.
     * Currently, the function updates split counters and sends an event to
     * the GUI.
     *
     * @param split
     *            Split type that occurred.
     * @param unsolvedRegions
     *            List of unsolved regions generated from the split.
     * @param solvedRegions
     *            List of solved regions generated by the split.
     * @param splitPoints
     *            Points at which the region was split.
     * @param bestChunkSize
     *            Size of largest chunk generated from the split.
     * @param bestSplitDimension
     *            Dimension on which splits occurred.
     */
    protected final void HandleSplit(SplitType split, List<DataRegion> unsolvedRegions,
            List<DataRegion> solvedRegions, List<Double> splitPoints,
            int bestChunkSize, int bestSplitDimension) {
        NetworkSplitEvent.SplitType networkSplit;

        // increment appropriate split count
        splitCounts.put(split, splitCounts.get(split) + 1);

        switch(split) {
            case chunk:
                networkSplit = NetworkSplitEvent.SplitType.chunk;
                break;
            case centroid:
            default:
                networkSplit = NetworkSplitEvent.SplitType.centroid;
                break;
        }

        NetworkSplitEvent e = new NetworkSplitEvent(this, bestChunkSize, networkQueue.size(),
                trainingMNetwork.getSolvedCount(), bestSplitDimension, networkSplit,
                trainingMNetwork.getNetworkCount(), computeSolvedPercentage(),
                unsolvedRegions, solvedRegions, splitPoints);
        dispatchNetworkSplitEvent(e);
    }

    /**
     * Called by the algorithm after the current network has converged.
     *
     * @param r
     *            Data region solved by network.
     * @param iterations
     *            Iterations taken to solve the network.
     */
    protected final void HandleNetworkConverged(DataRegion r, int iterations) {
        NetworkConvergedEvent nce = new NetworkConvergedEvent(this,
                r,                                  // solvedRegion
                networkQueue.size(),                // networksInQueue
                trainingMNetwork.getNetworkCount(), // totalNetworks
                computeSolvedPercentage(),          // percentage solved
                iterations);                        // iterations
        dispatchNetworkConvergedEvent(nce);
    }

    /**
     * Called by the algorithm after all network training is complete.
     *
     * @param isSolved
     *            Has the modular neural network solved the training set?
     */
    protected final void HandleTrainingCompleted(boolean isSolved) {
        long endTime = System.currentTimeMillis();

        // walk through networks and compute solved values
        for(int i=0; i < trainingMNetwork.getNetworkCount(); i++) {
            NetworkRecord nr = trainingMNetwork.getNetworkRecord(i);
            try {
                setSolvedDataPoints(nr);
            } catch(NeuralNetworkException ex) {
                System.out.println("Error in setSolvedDataPoints: " + ex.getMessage());
            }
        }

        TrainingResults tr = new TrainingResults();
        tr.setTrainingIterations(totalIterations);
        tr.setChunkSplits(splitCounts.get(SplitType.chunk));
        tr.setChunksProduced(trainingMNetwork.getNetworkCount());
        tr.setCentroidSplits(splitCounts.get(SplitType.centroid));
        tr.setSizeFailures(sizeFailures);
        tr.setSolved(isSolved);
        tr.setTrainingDuration((endTime - startTime) / 1000);

        TrainingCompletedEvent tce = new TrainingCompletedEvent(this, tr);
        dispatchTrainingCompletedEvent(tce);
    }
    // </editor-fold>

    // <editor-fold defaultstate="collapsed" desc="Listener Methods">
    public void addListener(SplitTrainerEventListenerInterface l) {
        listeners.add(l);
    }

    public void removeListener(SplitTrainerEventListenerInterface l) {
        listeners.remove(l);
    }

    public void clearListeners() {
        listeners.clear();
    }

    /**
     * @param e
     */
    protected void dispatchNetworkSplitEvent(NetworkSplitEvent e) {
        for (SplitTrainerEventListenerInterface l : listeners) {
            l.networkSplit(e);
        }
    }

    /**
     * @param e
     */
    protected void dispatchNetworkConvergedEvent(NetworkConvergedEvent e) {
        for (SplitTrainerEventListenerInterface l : listeners) {
            l.networkConverged(e);
        }
    }

    /**
     * @param e
     */
    protected void dispatchTrainingCompletedEvent(TrainingCompletedEvent e) {
        for (SplitTrainerEventListenerInterface l : listeners) {
            l.trainingCompleted(e);
        }
    }

    /**
     * @param e
     */
    protected void dispatchNetworkTrainingEvent(NetworkTrainingEvent e) {
        for (SplitTrainerEventListenerInterface l : listeners) {
            l.networkTraining(e);
        }
    }
    // </editor-fold>
}
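
/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): the smallest
 * possible splitting algorithm built on SSNNTrainingAlgorithm above.
 * It never splits, so training either converges on the first network
 * or stops (when failOnUnsplittableNetwork is set). The class name is
 * hypothetical.
 * ------------------------------------------------------------------- */
package edu.csus.ecs.ssnn.splittrainer;

import edu.csus.ecs.ssnn.nn.DataRegion;
import edu.csus.ecs.ssnn.nn.DataSet;
import edu.csus.ecs.ssnn.nn.NeuralNetwork;
import edu.csus.ecs.ssnn.nn.NeuralNetworkException;

public class NoSplitTrainerSketch extends SSNNTrainingAlgorithm {
    @Override
    protected SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException {
        return SplitType.impossible; // never split; report the region as unsplittable
    }
}

/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): a minimal
 * listener sketch for the four events dispatched above. The callback
 * names are assumed from the dispatch calls in SSNNTrainingAlgorithm;
 * the class name is hypothetical.
 * ------------------------------------------------------------------- */
import edu.csus.ecs.ssnn.event.*;

class LoggingListenerSketch implements SplitTrainerEventListenerInterface {
    public void networkTraining(NetworkTrainingEvent e) {
        System.out.println("training a new network");
    }
    public void networkSplit(NetworkSplitEvent e) {
        System.out.println("network split");
    }
    public void networkConverged(NetworkConvergedEvent e) {
        System.out.println("network converged");
    }
    public void trainingCompleted(TrainingCompletedEvent e) {
        System.out.println("training completed");
    }
}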

/* ===================================================================
   edu.csus.ecs.ssnn.splittrainer.SSNNTrainingAlgorithmInterface.java
   =================================================================== */
package edu.csus.ecs.ssnn.splittrainer;

import edu.csus.ecs.ssnn.event.SplitTrainerEventListenerInterface;
import edu.csus.ecs.ssnn.nn.trainer.TrainingAlgorithmInterface;
import edu.csus.ecs.ssnn.nn.*;

/**
 * Interface for SSNN training algorithms. Implementations must provide train()
 * and allow other components to attach event listeners.
 */
public interface SSNNTrainingAlgorithmInterface {
    /**
     * @param n
     *            Empty modular neural network. It defines the structure of the
     *            individual neural networks, as well as maintaining internal
     *            lists of networks and other information on the state of training.
     * @param trainingData
     *            The training set.
     * @param a
     *            The training algorithm used to train individual neural networks.
     * @return True if trained successfully, false otherwise.
     * @throws NeuralNetworkException
     *             Thrown on errors in getting region specific data, training an
     *             individual network, or splitting the network.
     */
    public boolean train(ModularNeuralNetwork n, DataSet trainingData,
            TrainingAlgorithmInterface a) throws NeuralNetworkException;

    /**
     * Adds passed listener to collection. The event is triggered on splits.
     *
     * @param l
     *            Event listener
     */
    public void addListener(SplitTrainerEventListenerInterface l);

    /**
     * Removes passed listener from collection.
     *
     * @param l
     *            Event listener
     */
    public void removeListener(SplitTrainerEventListenerInterface l);

    /**
     * Clears all listeners attached to the interface
     */
    public void clearListeners();
}

/* ===================================================================
   edu.csus.ecs.ssnn.splittrainer.TrainedResultsSplitTrainer.java
   =================================================================== */
package edu.csus.ecs.ssnn.splittrainer;

import edu.csus.ecs.ssnn.nn.*;
import java.util.ArrayList;
import java.util.List;

public class TrainedResultsSplitTrainer extends SSNNTrainingAlgorithm {

    public TrainedResultsSplitTrainer() {
        super();
    }

    /* Basic SSNN Splitting algorithm
     * Find largest solved chunk and split network into either:
     * a) [solved] [unsolved] (if solved is on an edge of region)
     * b) [unsolved] [solved] [unsolved] (if solved in the middle of region)
     * If a sufficiently large chunk cannot be found, does a centroid split.
     */
    @Override
    protected SplitType splitNetwork(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException {
        Chunk best = findBestChunk(n, d);

        // Now determine what kind of split to do...
        if (best.length >= minChunkSize) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(best.dimension);
            double domainMin = sortedData.getPair(0).getInput(best.dimension);
            double domainMax = sortedData.getPair(sortedData.size() - 1).getInput(best.dimension);

            // Somehow our chunk covers the entire domain
            /*
             * It's actually possible for this to happen, although it is
             * extremely rare. If the training algorithm does not converge, but
             * its very last adjustment to the network made it perfect, this
             * case will occur.
             */
            if (best.start == domainMin && best.end == domainMax) {
                return SplitType.unnecessary;
            }
            // Are we on the lower or upper edge?
            else if (best.start == domainMin || best.end == domainMax) {
                // Make sure there are at least two distinct values in the chunk
                if (countDistinctValuesInChunk(best.start, best.end, sortedData, best.dimension) < 2) {
                    // Can't split (too few values) - try centroid
                    return centroidSplit(n, r, d);
                } else {
                    // split based on min/max
                    List<DataRegion> splitRegions;
                    ArrayList<Double> splitPoints = new ArrayList<Double>();
                    ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
                    ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();

                    // lower half is solved
                    if(best.start == domainMin) {
                        splitRegions = r.split(best.dimension, best.end);
                        unsolvedRegions.add(splitRegions.get(1));
                        solvedRegions.add(splitRegions.get(0));
                        splitPoints.add(best.end);
                    // upper half is solved
                    } else {
                        splitRegions = r.split(best.dimension, best.start);
                        unsolvedRegions.add(splitRegions.get(0));
                        solvedRegions.add(splitRegions.get(1));
                        splitPoints.add(best.start);
                    }

                    // Remove existing network
                    trainingMNetwork.removeNetwork(n);
                    // Re-add solved region
                    queueSolvedNetwork(n, solvedRegions);
                    // create new network for the unsolved region
                    queueUnsolvedNetworks(n, unsolvedRegions);

                    // Create and dispatch split event
                    HandleSplit(SplitType.chunk, unsolvedRegions, solvedRegions,
                            splitPoints, best.length, best.dimension);
                    return SplitType.chunk;
                }
            }
            // We're in the middle of the data set
            else {
                // Need at least three distinct values in the chunk
                if (countDistinctValuesInChunk(best.start, best.end, sortedData, best.dimension) < 3) {
                    // Can't split - try centroid
                    return centroidSplit(n, r, d);
                } else {
                    // split data into [unsolved] [solved] [unsolved]
                    ArrayList<Double> splitPoints = new ArrayList<Double>();
                    splitPoints.add(best.start);
                    splitPoints.add(best.end);

                    List<DataRegion> firstSplit = r.split(best.dimension, best.start);
                    List<DataRegion> secondSplit = firstSplit.get(1).split(best.dimension, best.end);

                    ArrayList<DataRegion> unsolvedRegions = new ArrayList<DataRegion>();
                    unsolvedRegions.add(firstSplit.get(0));  // low unsolved region
                    unsolvedRegions.add(secondSplit.get(1)); // high unsolved region

                    ArrayList<DataRegion> solvedRegions = new ArrayList<DataRegion>();
                    solvedRegions.add(secondSplit.get(0));   // central solved region

                    // Remove existing network
                    trainingMNetwork.removeNetwork(n);
                    // Re-add solved region
                    queueSolvedNetwork(n, solvedRegions);
                    // create new networks for the unsolved upper / lower regions
                    queueUnsolvedNetworks(n, unsolvedRegions);

                    // Create and dispatch split event
                    HandleSplit(SplitType.chunk, unsolvedRegions, solvedRegions,
                            splitPoints, best.length, best.dimension);
                    return SplitType.chunk;
                }
            }
        // No chunk or chunk too small
        } else {
            sizeFailures++;
            return centroidSplit(n, r, d);
        }
    }

    /*
     * Splits Dataset across 2 neural networks with no solved regions.
     */
    private SplitType centroidSplit(NeuralNetwork n, DataRegion r, DataSet d)
            throws NeuralNetworkException {
        // To select the dimension to split along, find the one with the most
        // distinct values
        int bestDimension = 0;
        int mostDistinctValues = 0;
        for (int inputNum = 0; inputNum < n.getNumInputs(); inputNum++) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(inputNum);
            int currDistinctValues = countDistinctValuesInChunk(
                    sortedData.getPair(0).getInput(inputNum),
                    sortedData.getPair(sortedData.size() - 1).getInput(inputNum),
                    sortedData, inputNum);
            if (currDistinctValues > mostDistinctValues) {
                mostDistinctValues = currDistinctValues;
                bestDimension = inputNum;
            }
        }

        if (mostDistinctValues > 2) {
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(bestDimension);
            int splitPoint = mostDistinctValues / 2;

            // creating solved regions list (empty), unsolved regions, and split points list
            ArrayList<Double> splitPoints = new ArrayList<Double>();
            splitPoints.add(sortedData.getPair(splitPoint).getInput(bestDimension));
            List<DataRegion> unsolvedRegions = r.split(bestDimension,
                    sortedData.getPair(splitPoint).getInput(bestDimension));
            List<DataRegion> solvedRegions = new ArrayList<DataRegion>();

            // removing original network - replaced by new split networks
            trainingMNetwork.removeNetwork(n);
            // create new networks for the unsolved upper / lower regions
            queueUnsolvedNetworks(n, unsolvedRegions);

            HandleSplit(SplitType.centroid, unsolvedRegions, solvedRegions,
                    splitPoints, 0, bestDimension);
            return SplitType.centroid;
        }

        // No way to split
        return SplitType.impossible;
    }

    /*
     * Count all distinct values in the dimension that fall between min / max
     */
    private int countDistinctValuesInChunk(double minValue, double maxValue,
            DataSet sortedData, int chunkDimension) {
        int count = 0;
        double lastValue = Double.MIN_VALUE;
        for (DataPair p : sortedData) {
            double curValue = p.getInput(chunkDimension);
            if (curValue > maxValue) {
                break;
            }
            if (curValue >= minValue) {
                if (curValue != lastValue) {
                    lastValue = curValue;
                    count++;
                }
            }
        }
        return count;
    }

    /*
     * Finds largest solved chunk across all domains
     */
    private Chunk findBestChunk(NeuralNetwork n, DataSet d) throws NeuralNetworkException {
        Chunk current;
        Chunk best = new Chunk();

        // For each input dimension, find largest solved chunk
        for (int inputNum = 0; inputNum < d.getNumInputs(); inputNum++) {
            // Create a dataset which is sorted on the input dimension
            DataSet sortedData = (DataSet) d.clone();
            sortedData.sortOnInput(inputNum);
            current = new Chunk(); // reset chunk for new dimension
            boolean failed;
            double lastFail = Double.MIN_VALUE; // track last edge that failed
            double lastGood = Double.MIN_VALUE; // track last edge that passed

            for (int pairNum = 0; pairNum < sortedData.size(); pairNum++) {
                DataPair currentPair = sortedData.getPair(pairNum);

                // fast-forward through points with same dimensional value if original failed
                // (don't want to split on a dimensional point with a mix of good/bad values)
                if(pairNum > 0 && current.length == 0
                        && currentPair.getInput(inputNum) == lastFail) {
                    continue;
                }

                List<Double> outputs = n.getOutputs(currentPair.getInputs());

                // Check each output for correctness against expected
                failed = false;
                for (int outputNum = 0; outputNum < outputs.size(); outputNum++) {
                    // output is incorrect, reset chunk
                    if (Math.abs(outputs.get(outputNum)
                            - currentPair.getOutput(outputNum)) > acceptableErrors[outputNum]) {
                        failed = true;
                        break;
                    }
                }

                if (failed) {
                    lastFail = currentPair.getInput(inputNum);
                    // only bother with the rest if we currently have a chunk of passing points
                    if(current.length > 0) {
                        // if point has same dimensional value as last good point, then we
                        // can't use this point - must revert back to previous
                        if(current.end == currentPair.getInput(inputNum)) {
                            current.length--;
                            current.end = lastGood;
                        }
                        // found a new best!
                        if(current.length > best.length) {
                            best = current;
                        }
                        current = new Chunk(); // reset chunk
                    }
                }
                // output is correct, increment chunk (if not already added for that dimensional value)
                // only want 1 entry for each dimensional value
                else if(currentPair.getInput(inputNum) != current.end) {
                    current.length++;
                    lastGood = current.end;
                    current.end = currentPair.getInput(inputNum);
                    if(current.length == 1) {
                        // this is a new chunk
                        current.start = current.end;
                        current.dimension = inputNum;
                    }
                }
            }

            // extra check to make sure last chunk wasn't best
            if(current.length > best.length) {
                best = current;
            }
        }

        // best should now hold best chunk values
        return best;
    }
}
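
/* -------------------------------------------------------------------
 * Illustrative aside (not part of the project source): a sketch of how
 * the classes in this listing fit together, assuming a populated
 * DataSet and a prepared ModularNeuralNetwork are supplied by the
 * caller, and that DataFormatException lives in the nn package as the
 * imports above suggest. SkeletonTrainerSketch is the hypothetical
 * per-network trainer sketched earlier; a real run would use PSOTrainer
 * or another concrete TrainingAlgorithmInterface implementation.
 * ------------------------------------------------------------------- */
package edu.csus.ecs.ssnn.splittrainer;

import java.util.Arrays;

import edu.csus.ecs.ssnn.nn.*;
import edu.csus.ecs.ssnn.nn.trainer.SkeletonTrainerSketch;

class SplitTrainingSketch {
    static boolean run(ModularNeuralNetwork mn, DataSet data)
            throws NeuralNetworkException, DataFormatException {
        TrainedResultsSplitTrainer splitter = new TrainedResultsSplitTrainer();

        double[] err = new double[data.getNumOutputs()];
        Arrays.fill(err, 0.1);       // acceptable error per output dimension
        splitter.setAcceptableErrors(err);
        splitter.setMinChunkSize(4); // smallest chunk worth splitting on

        // train, splitting whenever an individual network fails to converge
        return splitter.train(mn, data, new SkeletonTrainerSketch(100000, err));
    }
}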
BIBLIOGRAPHY

[Fowler] A. Fowler. A Swing Architecture Overview.
    <http://java.sun.com/products/jfc/tsc/articles/architecture/>. Accessed 5/10/2010.

[Gordon1] V. Gordon and J. Crouson. 2008. Self-Splitting Modular Neural Network - Domain
    Partitioning at Boundaries of Trained Regions. 2008 International Joint Conference on
    Neural Networks (IJCNN 2008), Hong Kong.

[Gordon2] V. Gordon. 2008. Neighbor Annealing for Neural Network Training. 2008
    International Joint Conference on Neural Networks (IJCNN 2008), Hong Kong.

[Gordon3] V. Gordon, M. Daniels, J. Boheman, M. Watstein, D. Goering, and B. Urban. 2009.
    Visualization Tool for a Self-Splitting Neural Network. 2009 International Joint
    Conference on Neural Networks (IJCNN 2009), Atlanta GA.

[Gordon4] T. Bender, V. Gordon, and M. Daniels. 2009. Partitioning Strategies for Modular
    Neural Networks. 2009 International Joint Conference on Neural Networks (IJCNN 2009),
    Atlanta GA.

[Hu] X. Hu. 2006. Particle Swarm Optimization: Tutorial.
    <http://www.swarmintelligence.org/tutorials.php>. Accessed 6/1/2008.

[Lu] X. Lu, N. Bourbakis. 1998. IEEE International Joint Symposia on Intelligence and
    Systems.

[Rojas] R. Rojas. Neural Networks - A Systematic Introduction. Springer, 1996.

[Russell] S. Russell, P. Norvig. Artificial Intelligence - A Modern Approach. New Jersey, 2003.