NEURAL NETWORK FOR INTELLIGENT SYSTEM MONITORING

Ameya Shrikant Deo
B.E., Pune University, India, 2006

PROJECT

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

COMPUTER SCIENCE

at

CALIFORNIA STATE UNIVERSITY, SACRAMENTO

SPRING 2012


NEURAL NETWORK FOR INTELLIGENT SYSTEM MONITORING

A Project

by

Ameya Shrikant Deo

Approved by:

__________________________________, Committee Chair
V. Scott Gordon, Ph.D.

__________________________________, Second Reader
Martin Nicholes, Ph.D.

____________________________
Date


Student: Ameya Shrikant Deo

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the Project.

__________________________, Graduate Coordinator          ________________
Nikrouz Faroughi, Ph.D.                                    Date
Department of Computer Science


Abstract of

NEURAL NETWORK FOR INTELLIGENT SYSTEM MONITORING

by

Ameya Shrikant Deo

This project focuses on introducing an intelligent algorithm into a commercially available monitoring solution. Typical monitoring software has no capability of generating intelligent alerts. To overcome this limitation, a multilayered artificial neural network is added to the architecture of the monitoring system. Historical data from the monitoring application's database is used to train the neural network. The trained neural network is then tested against newly received data, and intelligent alerts are generated to predict the future behavior of the system. These alerts are forwarded to the event management and reporting tools, which ultimately notify the system administrator.

______________________, Committee Chair
V. Scott Gordon, Ph.D.

_______________________
Date


ACKNOWLEDGEMENTS

I offer my sincere gratitude to my project advisor, Dr. Scott Gordon, for helping me choose my project topic and for the time he spent with me during the entire project implementation. I would also like to thank him for guiding and correcting my project report. A special thanks to Dr. Martin Nicholes for reviewing my project report. I express my thanks to the Computer Science department and all the faculty members, without whom this project would have remained a distant reality. I also extend my heartiest thanks to my family and my well-wishers.


TABLE OF CONTENTS

Acknowledgements
List of Tables
List of Figures

Chapter
1. INTRODUCTION
2. BACKGROUND
   2.1 BMC Performance Manager
       2.1.1 Patrol Architecture
             2.1.1.1 Patrol Agent
             2.1.1.2 Patrol Knowledge Modules (KM)
             2.1.1.3 Patrol Console
       2.1.2 Patrol Scripting Language (PSL)
       2.1.3 Event Management and Reporting Utility
   2.2 Artificial Neural Networks
       2.2.1 Artificial Neurons
       2.2.2 Backpropagation Algorithm
3. PATROL MONITORING WITH INTEGRATED NEURAL NETWORK
4. RESULTS
5. CONCLUSION
6. FUTURE WORK
Appendix A. Source Code
Appendix B. Dataset
Appendix C. Results
Bibliography


LIST OF TABLES

Table 1  Parameter Border Ranges
Table 2  HAAGPerformancePrediction Border Ranges
Table 3  Parameter values for a non-busy system
Table 4  Parameter values for a moderately busy system
Table 5  Parameter values for a busy system


LIST OF FIGURES

Figure 1   Patrol Architecture
Figure 2   Event Management and Reporting Utility
Figure 3   Artificial Neural Network
Figure 4   Artificial Neuron with Activation Function
Figure 5   Activation Function
Figure 6   Backpropagation Algorithm
Figure 7   Patrol Monitoring with Introduction of Artificial Neural Network
Figure 8   Patrol with Neural Network and Event Management and Reporting Utility
Figure 9   Example of Training Data
Figure 10  Addition of a New Parameter HAAGPerformancePrediction in Unix KM
Figure 11  Border Ranges and PSL Code for HAAGPerformancePrediction
Figure 12  HAAGPerformancePrediction with Values 1 and 0
Figure 13  BMC Event Manager (BEM) Received Event for Parameter HAAGPerformancePrediction
Figure 14  Remedy Ticket for Parameter HAAGPerformancePrediction


Chapter 1
INTRODUCTION

A typical commercial monitoring system collects and stores performance data for the systems or applications to be monitored. It checks whether the collected data is within acceptable ranges and notifies the user if the data falls outside the limits set within the system. Such systems are designed only to catch and report errors as they appear. They lack intelligence and generally have no capability to predict the future behavior of the system from the data stored within the monitoring application. The introduction of a neural network can overcome this limitation: a neural network can be trained with the data collected by the monitoring system and then used to predict the future behavior of the monitored application or system.

This project introduces a neural network to add intelligence to the commercially available monitoring solution BMC Patrol. Multiple performance statistics are available in BMC Patrol. This project extracts data from the monitoring solution's local database, uses this data to train a multilayered artificial neural network, and then uses the trained network to predict whether important system resources could be overused in the near future.

There are multiple intelligent monitoring systems available on the market for commercial use, but these systems introduce intelligence during the later phases of monitoring, namely event processing and ticket generation. They do not introduce artificial intelligence at the source of data collection. Introducing a neural network to process Patrol data can overcome this limitation. The main idea behind the design of the new system is this: a neural network is trained with Patrol history data, and its output is tested against the new data which Patrol collects at every polling interval. If the neural network predicts that system resources may be overused in the near future, an alarm is generated. This alarm is forwarded to the event processing engine and then to the Remedy tool for ticket generation.
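Putting the pieces together, the intended pipeline can be sketched in a few lines of C++. This is a conceptual stand-in only: the real prediction comes from the trained network listed in Appendix A, and the names and thresholds below are illustrative.

    #include <iostream>

    // All names below are illustrative; metric values are percentages (0-100).
    struct Metrics { double cpu, mem, fsOpt, fsRoot; };

    // Hand-written stand-in for the trained network: flag any metric that is
    // within 10% of its Patrol warning threshold (90 for CPU/memory, 96 for
    // file systems; see the border ranges later in Table 1). The real system
    // uses the neural network's output instead of this rule.
    bool predictOveruse(const Metrics& m) {
        return m.cpu > 0.9 * 90 || m.mem > 0.9 * 90 ||
               m.fsOpt > 0.9 * 96 || m.fsRoot > 0.9 * 96;
    }

    int main() {
        Metrics m{83.0, 79.0, 3.542, 56.19};  // the Chapter 4 "moderately busy" case
        // An alarm here would be forwarded to BMC Event Manager, then to Remedy.
        std::cout << (predictOveruse(m) ? "ALARM" : "OK") << "\n";
    }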
Chapter 2
BACKGROUND

2.1 BMC Performance Manager

One of the major tasks of a system administrator is monitoring the vital resources of a system. Multiple monitoring solutions are commercially available; BMC Performance Manager, commonly known as BMC Patrol and produced by BMC Software, is one of the leading products in the field. BMC Patrol is an event management tool which provides an environment for monitoring the status of every vital resource in a distributed environment. It generates events on the basis of predefined abnormalities, forwards them to an event processing engine (BMC Event Manager), generates tickets (through the BMC Remedy tool), and assigns these tickets to the appropriate team.

2.1.1 Patrol Architecture

Figure 1: Patrol Architecture [1]

The main components of BMC Patrol are the Patrol Agent, Patrol Knowledge Modules (KMs), and the Patrol Console.

2.1.1.1 Patrol Agent

The Patrol Agent is the core piece of the Patrol architecture; it monitors and manages the host computer. It runs as a process on the host and collects the system and application data to be monitored. The Patrol Agent performs the following tasks:
i.   Runs commands to collect system or application information
ii.  Stores information locally in the Patrol history datastore for future retrieval
iii. Provides services for event management
iv.  Loads Knowledge Modules and Patrol Scripting Language (PSL) library files

2.1.1.2 Patrol Knowledge Modules (KM)

A Patrol KM is a set of files from which the Patrol Agent receives information about the resources to monitor. The KM directs the Patrol Agent to collect the data to be monitored. Different KMs are available for monitoring different applications and resources, e.g., databases, file systems, etc. A Patrol KM instructs the Patrol Agent about:
i.   How to monitor the application
ii.  What parameters to monitor
iii. How to identify the objects to be monitored
iv.  How to define the state of discovered objects
v.   What action to take when an object changes state

2.1.1.3 Patrol Console

The Patrol Console is a graphical user interface (GUI) for working with the Patrol architecture. It displays all the monitored hosts and applications. Tasks that can be performed with the Patrol Console include:
i.   Monitoring and managing computers and applications through the Patrol Agent and Patrol KMs
ii.  Viewing parameter data
iii. Building new KMs and customizing existing KMs
iv.  Starting and stopping remote Patrol Agents

2.1.2 Patrol Scripting Language (PSL)

PSL is a language, both interpreted and compiled, for writing complex application discovery procedures, parameters, and commands within the Patrol environment. It is a proprietary, platform-independent scripting language from BMC. PSL has its own virtual machine, which compiles the source code into machine code. Compiled PSL code is converted into library files (.lib) which can be invoked directly from Patrol Knowledge Modules.

2.1.3 Event Management and Reporting Utility

Patrol identifies abnormalities in an application or operating system, but commercial use of Patrol demands the ability to process, customize, and report these abnormalities. BMC Event Manager (BEM) receives events from Patrol, processes them, and forwards them to the reporting engine, the BMC Remedy tool.
Figure 2: Event Management and Reporting Utility [1]

The Patrol Agent collects application performance data and checks whether the collected data is within the boundary limits set by the Knowledge Module. If the data crosses a limit, Patrol generates an abnormality event and forwards it to BMC Event Manager. BEM processes the event and applies any required customization before forwarding it to Remedy. Remedy is a ticketing tool: it generates an incident ticket for the event and assigns it to the concerned team within the organization. More information about these products can be found on BMC Software's official web site. [BMCCS]

2.2 Artificial Neural Networks

Artificial neural networks process information in a way similar to how biological neurons process information in the brain's central nervous system. The main element of an artificial neural network is an information processing system composed of highly interconnected processing elements called artificial neurons, which work in unison to solve a specific problem. It is an adaptive system which changes its structure according to the information that flows through it. A neural network is trained with a set of training data so that a given set of inputs produces the desired set of outputs.

The computing model for artificial neural networks was first proposed by McCulloch and Pitts in the 1940s. In the 1950s, Rosenblatt introduced a two-layer network which was capable of learning certain classifications by adjusting connection weights. This model could only solve very simple problems, which led the study of the field into a period of uncertainty; nevertheless, it built the foundation for later work in neural computing. In the 1980s, methods for training multilayer neural networks were developed. Multilayer networks can solve more complex problems, which resulted in today's commercial use of neural networks.

Figure 3: Artificial Neural Network

Figure 3 represents the basic structure of an artificial neural network. It consists of three main kinds of layers:

Input layer: Input to the system is fed into the input layer. Each input is multiplied by an interconnection weight and passed to the next layer.
One or more hidden layers: The computation of the system is performed in the hidden layers. Weighted values are combined and passed on to the next layer for further computation.
Output layer: The output of the system is calculated in the output layer.
Weights: The strength of the connection between nodes.

Neural networks learn by example. A set of training data including both inputs and outputs is provided to the network, and the network adjusts the weights on its neurons with the help of a training algorithm to approach the desired output. Multiple iterations of the training algorithm are usually required. A network can be considered trained once it produces the desired outputs for the training inputs. Once the network is trained, it is tested against a separate set of test data to determine whether the learning process was successful and whether the network generalizes to other cases. During testing, the network uses the weight values already calculated in training.

Artificial neural networks offer an analytical alternative to conventional problem-solving techniques. They have the ability to learn how to perform tasks from training data, can be applied to a large variety of domains, and require relatively less formal statistical training than classical modeling techniques.
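The implementation used in this project (Appendix A) represents this layered structure with a simple per-neuron record holding an output, an error term, and one weight per node in the previous layer. A simplified, self-contained version of that structure is shown below; the sizes are chosen for illustration only.

    #include <array>
    #include <iostream>

    constexpr int MaxRows = 5;  // nodes per column, incl. bias (illustrative)
    constexpr int NumCols = 3;  // input, hidden, and output columns

    // Modeled on the CellRecord struct in Appendix A: every neuron keeps its
    // last output, its backpropagated error, and a weight (plus the previous
    // weight change, used by the momentum term) for each node feeding it.
    struct Cell {
        double output = 0.0;
        double error  = 0.0;
        std::array<double, MaxRows> weights{};
        std::array<double, MaxRows> prevDelta{};
    };

    // The whole network is simply a grid of cells: one column per layer.
    Cell network[MaxRows][NumCols];

    int main() {
        std::cout << "cells allocated: " << MaxRows * NumCols << "\n";
    }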
2.2.1 Artificial Neurons

An artificial neuron is an abstraction of a biological neuron and is the basic building block of an artificial neural network. A set of inputs is applied to the neuron, each input typically representing the output of another neuron. Each input is multiplied by a corresponding weight, and all the weighted inputs are summed to determine the activation level of the neuron. This weighted sum, calculated for each neuron in the network, is termed NET. NET is then passed through the activation function, producing the OUT signal. Figure 4 shows a neuron with its activation function.

Figure 4: Artificial Neuron with Activation Function

As shown in the diagram, the inputs to the neuron are X1, X2, ..., Xn. These inputs are multiplied by their corresponding weights W1, W2, ..., Wn and summed to produce NET:

    NET = X1*W1 + X2*W2 + ... + Xn*Wn

or, in vector form:

    NET = XW

The NET signal is then processed by the activation function, producing the OUT signal. With a threshold activation function:

    OUT = 1 if NET > T
    OUT = 0 otherwise

The threshold T is a constant which simulates the nonlinear characteristics of biological neurons. It is more common practice to use a continuous function, termed a squashing function, instead of a hard threshold. A squashing function limits OUT to a fixed range regardless of the value of NET. Generally this function is a sigmoid (S-shaped) function, represented as:

    OUT = 1 / (1 + e^(-NET))

This function is called a squashing function because it compresses the range of the output and ensures that the output never exceeds predefined limits: with the sigmoid, the output of the neuron always lies between 0 and 1. Figure 5 shows the sigmoid activation function.

Figure 5: Activation Function

A neural network is trained so that the application of certain inputs produces the desired set of outputs. The training process involves adjusting the weights of the neurons to produce those outputs. Various training algorithms are available to train artificial neural networks; one of the most popular is the backpropagation algorithm.

2.2.2 Backpropagation Algorithm

Backpropagation is one of the most common methods of training artificial neural networks. It is a multistage dynamic system optimization method used for supervised training. In supervised training, each input vector is paired with a target vector representing the desired output, and the network is trained over a number of such training pairs. An input vector is applied, the output of the network is calculated and compared with the target vector, and the difference, called the error, is fed back through the network. The error is then used to modify the weights so as to bring the outputs closer to the targets.

The backpropagation algorithm is an iterative method, applied mainly to feedforward neural networks. In each iteration, outputs are calculated from the training data and propagated forward through the network, and the errors are propagated backward. The goal of the algorithm is to minimize the error by adjusting the weights between the nodes. Generally, a set of training data is used to train the network, and test data is used to check how well the trained network generalizes.
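To make the forward pass and the weight adjustment concrete, the following is a minimal single-neuron training step in C++. It is a sketch of the procedure just described rather than the project code; the inputs, target, learning rate, and iteration count are illustrative, and the delta rule used is the output-layer formula given below.

    #include <cmath>
    #include <cstdio>

    int main() {
        // One sigmoid neuron with two inputs plus a bias (illustrative values).
        double x[3] = {0.83, 0.79, 1.0};   // inputs; x[2] is the bias input
        double w[3] = {0.1, -0.2, 0.05};   // small initial weights
        double target = 1.0;               // desired output
        const double lr = 0.2;             // learning rate, as in Appendix A

        for (int it = 0; it < 1000; ++it) {
            double net = 0.0;              // NET = sum of x_i * w_i
            for (int i = 0; i < 3; ++i) net += x[i] * w[i];
            double out = 1.0 / (1.0 + std::exp(-net));  // OUT = sigmoid(NET)

            // delta = O(1-O)(T-O): the output-layer rule of section 2.2.2
            double delta = out * (1.0 - out) * (target - out);
            for (int i = 0; i < 3; ++i) w[i] += lr * delta * x[i];
        }
        double net = x[0]*w[0] + x[1]*w[1] + x[2]*w[2];
        std::printf("trained output: %f\n", 1.0 / (1.0 + std::exp(-net)));
    }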
The algorithm for a three-layer network can be summarized as shown in Figure 6.

Figure 6: Backpropagation Algorithm [WIKIBP]

The formula for calculating delta at the output layer is:

    delta = O(1 - O)(T - O)

where O is the neuron's output and T is its target. In the forward pass of the algorithm, each neuron in the hidden layer propagates its weighted output to the output layer; in the reverse pass, delta values are propagated backward from the output layer to the hidden layer. The key step in calculating the delta value for a neuron p in hidden layer j, just before the output layer k, is multiplying each delta value from layer k by its corresponding weight. The delta value for the hidden neuron is then obtained by summing all such products:

    delta(p,j) = O(p,j) * (1 - O(p,j)) * SUM over q of [ delta(q,k) * weight(pq,k) ]

The main advantage of the backpropagation algorithm is its ability to solve complex problems when trained properly. Its limitations are that learning is not guaranteed and convergence can be very slow.

Chapter 3
PATROL MONITORING WITH INTEGRATED NEURAL NETWORK

As mentioned in Chapter 2, a classic monitoring tool can be made intelligent by introducing an artificial neural network at the data collection phase. Data collected by Patrol can be fed to the neural network for training. The trained neural network can then be used to generate intelligent alerts, for example predicting the future behavior of the system and generating alarms or warnings well before system thresholds move beyond their predefined ranges. These alarms and warnings can then be forwarded to the event processing and ticket generation tools, which notify operators about the system's behavior. Figure 7 depicts the basic architecture of Patrol with the addition of an artificial neural network.

Figure 7: Patrol Monitoring with Introduction of Artificial Neural Network [1]

Figure 8 shows intelligent Patrol monitoring together with the event management and reporting utility.

Figure 8: Patrol with Neural Network and Event Management and Reporting Utility [1]

The Unix KM is used for monitoring Unix servers. It has different parameters for monitoring different operating system resources; these parameters collect data and store it in the Patrol history file every 75 seconds. In our design, some of the most important parameters, namely CPU utilization, memory utilization, and file system usage, are used to train the neural network. Data for these parameters is extracted from the Patrol history and provided as input to the network. The network trains itself with this data and predicts the future behavior of the system. Static data is used for training the neural network; Figure 9 shows an example of training data, and the actual training data can be found in Appendix B.

Figure 9: Example of Training Data

A new parameter named HAAGPerformancePrediction is added to the Unix KM. This parameter has a polling interval of one minute; that is, it collects data every 60 seconds.

Figure 10: Addition of a New Parameter HAAGPerformancePrediction in Unix KM

Every minute, HAAGPerformancePrediction collects the values of CPU utilization, memory utilization, and file system capacity for the two file systems /root and /opt-maestro. These values are provided to Patrol by default by the Unix KM. The new parameter writes these values to a file called TestingData.dat; the next step in the process is to provide the values from TestingData.dat as test data for the neural network.
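Judging from the PSL writer shown in Appendix A, each line of TestingData.dat carries the four metric values scaled to the 0..1 range, followed by a placeholder desired output of 0. A minimal C++ reader for one such line is sketched below; the sample values in the comment are illustrative.

    #include <fstream>
    #include <iostream>

    int main() {
        // A TestingData.dat line, as written by the PSL in Appendix A, holds
        // four inputs scaled to 0..1 plus a placeholder output, e.g.:
        //   0.033830 0.272880 0.035420 0.561900 0
        std::ifstream in("/tmp/asd/TestingData1.dat");
        double cpu, mem, fsOpt, fsRoot, placeholder;
        if (in >> cpu >> mem >> fsOpt >> fsRoot >> placeholder)
            std::cout << "cpu=" << cpu << " mem=" << mem
                      << " fsOpt=" << fsOpt << " fsRoot=" << fsRoot << "\n";
        else
            std::cerr << "could not read test case\n";
    }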
The already-trained neural network is run against the values from this file, and the result is written to the Output.dat file. This result is parsed by the HAAGPerformancePrediction parameter. The border ranges defined for this parameter are:

    0 = OK
    1 = Alarm

So if the value written to Output.dat is 0, no warning or alarm is generated; if the value is 1, an alarm is generated.

Figure 11: Border Ranges and PSL Code for HAAGPerformancePrediction

HAAGPerformancePrediction checks the value written in the Output.dat file and generates an alarm if the value is 1. Figure 12 depicts HAAGPerformancePrediction with values 1 and 0.

Figure 12: HAAGPerformancePrediction with Values 1 and 0

Alarms generated by Patrol are forwarded to the event processing engine, where the events are enriched and modified according to user requirements. The modified events are then forwarded to the Remedy tool, which generates a ticket and assigns it to an operator on the team that manages the server.

Figure 13 shows the event management tool after receiving a critical event (ALARM) for HAAGPerformancePrediction.

Figure 13: BMC Event Manager (BEM) Received Event for Parameter HAAGPerformancePrediction

Figure 14 shows the Remedy tool, which generated a ticket for the event received from BMC Event Manager for HAAGPerformancePrediction.

Figure 14: Remedy Ticket for Parameter HAAGPerformancePrediction

Chapter 4
RESULTS

Values for the parameters CPU utilization (CPUCpuUtil), memory usage (MEMUsedMemPerc), and file system capacity (FSCapacity, for the file systems /opt-maestro and /root) were extracted from the Patrol database and used as training data for the artificial neural network. Patrol sets border ranges for these parameters; that is, Patrol decides whether a parameter is in the OK, Warning, or Alarm state. The border ranges for the above parameters are given in Table 1.

Table 1: Parameter Border Ranges

    Parameter Name              OK       Warning   Alarm
    CPUCpuUtil                  0 - 90   90 - 95   95 - 100
    MEMUsedMemPerc              0 - 90   90 - 95   95 - 100
    FSCapacity (OPT-Maestro)    0 - 96   96 - 98   98 - 100
    FSCapacity (Root)           0 - 96   96 - 98   98 - 100

The training set includes combinations of parameter values in which some values are in the OK state, some in the Warning state, and some in the Alarm state. The complete training dataset can be found in Appendix B, and details of the results in Appendix C.

Table 2 shows the border ranges for the new parameter HAAGPerformancePrediction.

Table 2: HAAGPerformancePrediction Border Ranges

    Parameter Name               OK      Warning   Alarm
    HAAGPerformancePrediction    0 - 0   -         1 - 1

The output of the neural network determines the value of the new parameter HAAGPerformancePrediction. The new parameter should be in the OK state when the Patrol parameters for CPU, memory, and the file systems are all in the OK state and well below their warning ranges. HAAGPerformancePrediction should be in Alarm when all or some of the Patrol parameters are in Alarm, or when their values are close to their warning ranges. To determine the behavior of the new parameter under different circumstances, it was tested against three scenarios.
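The state decision exercised in the three scenarios reduces to a small mapping from the network's raw output to OK or Alarm. The sketch below mirrors the convention in Appendix A, where TestForward() writes "0" to Output.dat when the test case meets the criteria and "1" otherwise; the function name and tolerance value here are illustrative simplifications.

    #include <cmath>
    #include <iostream>

    enum class State { OK, Alarm };

    // The raw network output is compared to the placeholder target 0 within
    // a tolerance (Appendix A scales its criteria by the data's extrema;
    // the flat tolerance here is an illustrative stand-in).
    State stateFromNetworkOutput(double rawOut, double tolerance = 0.1) {
        return (std::fabs(rawOut) <= tolerance) ? State::OK : State::Alarm;
    }

    int main() {
        double raw[3] = {-0.0318, 0.9989, 1.0203};  // raw outputs from Appendix C
        for (double r : raw)
            std::cout << r << " -> "
                      << (stateFromNetworkOutput(r) == State::OK ? "OK" : "ALARM")
                      << "\n";
    }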
Case 1: A non-busy system

In this case, intelligent monitoring was tested on a non-busy system. On this server, the Patrol parameter values for CPU utilization, memory usage, and both file system capacities were well below their warning ranges. These values were provided as test data to the neural network and the results were observed. The HAAGPerformancePrediction parameter had the value 0 and was in the OK state, so the result of this case was successful.

Table 3: Parameter values for a non-busy system

    Parameter Name               Parameter Value (%)   Parameter State
    CPUCpuUtil                   3.383                 OK
    MEMUsedMemPerc               27.288                OK
    FSCapacity (OPT-Maestro)     3.542                 OK
    FSCapacity (Root)            56.19                 OK
    HAAGPerformancePrediction    0                     OK

Case 2: A moderately busy system

In this case, intelligent monitoring was tested on a moderately busy system. On this server, the Patrol parameter values for CPU utilization and memory usage were close to their warning ranges; the values for both file system capacity parameters were the same as in Case 1. These values were provided as test data to the neural network and the results were observed. HAAGPerformancePrediction had the value 1 and the parameter was in Alarm. As per the design, the intelligent parameter should be in Alarm when some of its input parameters are near their warning or alarm ranges, so the result of this case was successful. The alarm generated an event which was forwarded to BMC Event Manager; BMC Event Manager captured the event and sent it to the Remedy ticketing tool for ticket creation.

Table 4: Parameter values for a moderately busy system

    Parameter Name               Parameter Value (%)   Parameter State
    CPUCpuUtil                   83.00                 OK
    MEMUsedMemPerc               79.00                 OK
    FSCapacity (OPT-Maestro)     3.542                 OK
    FSCapacity (Root)            56.19                 OK
    HAAGPerformancePrediction    1                     Alarm

Case 3: A busy system

In this case, intelligent monitoring was tested on a busy system. On this server, the Patrol parameter value for CPU utilization was in Alarm and the value for memory usage was close to its warning range; the values for both file system capacity parameters were well below their warning ranges. These values were provided as test data to the neural network and the results were observed. HAAGPerformancePrediction had the value 1 and the parameter was in Alarm. The intelligent parameter should be in Alarm when any of its input parameters is in Alarm, so the result of this test was successful. The alarm generated by HAAGPerformancePrediction sent an event to BMC Event Manager, which forwarded the event to the ticket generation engine to generate a ticket.

Table 5: Parameter values for a busy system

    Parameter Name               Parameter Value (%)   Parameter State
    CPUCpuUtil                   100.00                Alarm
    MEMUsedMemPerc               85.00                 OK
    FSCapacity (OPT-Maestro)     25.50                 OK
    FSCapacity (Root)            40.89                 OK
    HAAGPerformancePrediction    1                     Alarm

Chapter 5
CONCLUSION

Artificial neural networks learn by example, and with a good training dataset they can be trained for use in practical applications. For real-world applications, training data is often scattered across a large scale and has to be managed effectively to make the learning process fast. Backpropagation is one of the most effective and easiest-to-use training algorithms for artificial neural networks.

BMC Patrol is one of the most widely used commercial monitoring solutions. It uses the classic method of monitoring; that is, it simply captures and reports errors as they appear. Its usefulness can be extended with the introduction of intelligent algorithms. With intelligence introduced at the source level of monitoring, it becomes possible to predict whether important system resources will be overused in the near future. An artificial neural network was integrated into the BMC Patrol application and trained on data from the Patrol history database.
This hybrid system was able to predict whether important system resources could be overused in the near future. The work added a new monitoring parameter to BMC Patrol with the capability to generate alarms and notify the system administrator about possible overuse of important resources.

Chapter 6
FUTURE WORK

There is plenty of scope for improving this project. It considers only four important Patrol parameters for training the neural network; from an administration point of view there are many other vital resources to consider, and the use of the neural network can be extended to cover them. The scope of this project is also limited to adding a single intelligent parameter to Patrol monitoring. Other intelligent parameters with different functionality could be added to make the monitoring more efficient, such as parameters that create baselines for system resource utilization by specific applications. Furthermore, this project only adds a neural network at the source level of data collection; the use of neural networks could similarly be extended to the event management and ticket generation phases, where a network could filter duplicate events and suppress false alarms. Finally, more effective and faster ways of training the neural network could be tested, such as particle swarm optimization or quickpropagation.

APPENDIX A
Source Code

/**************************************************
 Neural Network with Backpropagation
 --------------------------------------------------
 Adapted from D. Whitley, Colorado State University
 Modifications by S. Gordon
 Modified by Ameya Deo to integrate it with BMC Patrol, March 2012
 --------------------------------------------------
 Version 3.1 - October 2010 - criteria bug fix
 --------------------------------------------------
 compile with g++ nn.c
****************************************************/

#include <iostream>
#include <fstream>
#include <cmath>
#include <cstdlib>
using namespace std;

#define NumOfCols    3      /* number of layers +1, i.e., include input layer  */
#define NumOfRows    5      /* max number of rows in net +1, last is bias node */
#define NumINs       4      /* number of inputs, not including bias node       */
#define NumOUTs      1      /* number of outputs, not including bias node      */
#define LearningRate 0.2    /* most books suggest 0.3                          */
#define Criteria     0.05   /* all outputs must be within this to terminate    */
#define TestCriteria 0.1    /* all outputs must be within this to generalize   */
#define MaxIterate   100000 /* maximum number of iterations                    */
#define ReportIntv   101    /* print report every time this many cases done    */
#define Momentum     0.9    /* momentum constant                               */
#define TrainCases   49     /* number of training cases                        */
#define TestCases    1      /* number of test cases                            */

// network topology by column -----------------------------------
#define NumNodes1 5         /* col 1 - must equal NumINs+1     */
#define NumNodes2 5         /* col 2 - hidden layer 1, etc.    */
#define NumNodes3 1         /* output layer must equal NumOUTs */
#define NumNodes4 0
#define NumNodes5 0         /* note: layers include bias node  */
#define NumNodes6 0

#define TrainFile "/tmp/asd/TrainingData.dat" /* file containing training data */
#define TestFile  "/tmp/asd/TestingData1.dat" /* file containing testing data  */
#define OpFile    "/tmp/asd/Output.dat"       /* output file                   */
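/* Reader's note (added commentary, not part of the original program):
   with NumINs = 4 and the NumNodes settings above, this configures a
   4-input, one-hidden-layer network: 5 nodes in the input column
   (4 inputs plus bias), 5 in the hidden column (4 active plus bias),
   and 1 output node, matching the four monitored metrics and the
   single 0/1 prediction. */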
int NumRowsPer[6];          /* number of rows used in each column, incl. bias */
                            /* (sized to the six NumNodes defines above)      */
                            /* note - bias is not included on output layer    */
                            /* note - leftmost value must equal NumINs+1      */
                            /* note - rightmost value must equal NumOUTs      */

double TrainArray[TrainCases][NumINs + NumOUTs];
double TestArray[TestCases][NumINs + NumOUTs];
int CritrIt = 3 * TrainCases;

ifstream train_stream;      /* source of training data */
ifstream test_stream;       /* source of test data     */
ofstream output_stream;     /* output file             */

void   CalculateInputsAndOutputs();
void   TestInputsAndOutputs();
void   TestForward();
double ScaleOutput(double X, int which);
double ScaleDown(double X, int which);
void   GenReport(int Iteration);
void   TrainForward();
void   FinReport(int Iteration);
void   DumpWeights();

struct CellRecord
{
  double Output;
  double Error;
  double Weights[NumOfRows];
  double PrevDelta[NumOfRows];
};

struct CellRecord CellArray[NumOfRows][NumOfCols];
double Inputs[NumINs];
double DesiredOutputs[NumOUTs];
double extrema[NumINs+NumOUTs][2];   // [0] is low, [1] is hi
long   Iteration;

/************************************************************
   Get data from Training and Testing Files, put into arrays
*************************************************************/
void GetData()
{
  for (int i=0; i < (NumINs+NumOUTs); i++)
  { extrema[i][0]=99999.0;
    extrema[i][1]=-99999.0;
  }

  // read in training data
  train_stream.open(TrainFile);
  for (int i=0; i < TrainCases; i++)
  { for (int j=0; j < (NumINs+NumOUTs); j++)
    { train_stream >> TrainArray[i][j];
      if (TrainArray[i][j] < extrema[j][0]) extrema[j][0] = TrainArray[i][j];
      if (TrainArray[i][j] > extrema[j][1]) extrema[j][1] = TrainArray[i][j];
    }
  }
  train_stream.close();

  // read in test data
  test_stream.open(TestFile);
  for (int i=0; i < TestCases; i++)
  { for (int j=0; j < (NumINs+NumOUTs); j++)
    { test_stream >> TestArray[i][j];
      if (TestArray[i][j] < extrema[j][0]) extrema[j][0] = TestArray[i][j];
      if (TestArray[i][j] > extrema[j][1]) extrema[j][1] = TestArray[i][j];
    }
  }

  // guard against both extrema being equal
  for (int i=0; i < (NumINs+NumOUTs); i++)
    if (extrema[i][0] == extrema[i][1]) extrema[i][1]=extrema[i][0]+1;
  test_stream.close();

  // scale training and test data to range 0..1
  for (int i=0; i < TrainCases; i++)
    for (int j=0; j < NumINs+NumOUTs; j++)
      TrainArray[i][j] = ScaleDown(TrainArray[i][j],j);
  for (int i=0; i < TestCases; i++)
    for (int j=0; j < NumINs+NumOUTs; j++)
      TestArray[i][j] = ScaleDown(TestArray[i][j],j);
}

/**************************************************************
   Assign the next training pair
***************************************************************/
void CalculateInputsAndOutputs()
{ static int S=0;
  for (int i=0; i < NumINs; i++)  Inputs[i]=TrainArray[S][i];
  for (int i=0; i < NumOUTs; i++) DesiredOutputs[i]=TrainArray[S][i+NumINs];
  S++;
  if (S==TrainCases) S=0;
}

/**************************************************************
   Assign the next testing pair
***************************************************************/
void TestInputsAndOutputs()
{ static int S=0;
  for (int i=0; i < NumINs; i++)  Inputs[i]=TestArray[S][i];
  for (int i=0; i < NumOUTs; i++) DesiredOutputs[i]=TestArray[S][i+NumINs];
  S++;
  if (S==TestCases) S=0;
}

/************************* MAIN *************************************/
int main()
{
  int I, J, K, existsError, ConvergedIterations=0;
  long seedval;
  double Sum, newDelta;

  Iteration=0;
  NumRowsPer[0] = NumNodes1;
  NumRowsPer[1] = NumNodes2;
  NumRowsPer[2] = NumNodes3;
  NumRowsPer[3] = NumNodes4;
  NumRowsPer[4] = NumNodes5;
  NumRowsPer[5] = NumNodes6;

  /* initialize the weights to small random values. */
  /* initialize previous changes to 0 (momentum).   */
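  /* Reader's note (added commentary): the expression below draws each
     initial weight uniformly from [-1, 1); previous deltas start at 0,
     so the momentum term has no effect on the very first update. */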
  seedval = 555;
  srand(seedval);
  for (I=1; I < NumOfCols; I++)
    for (J=0; J < NumRowsPer[I]; J++)
      for (K=0; K < NumRowsPer[I-1]; K++)
      { CellArray[J][I].Weights[K] =
          2.0 * ((double)((int)rand() % 100000 / 100000.0)) - 1.0;
        CellArray[J][I].PrevDelta[K] = 0;
      }

  GetData();   // read training and test data into arrays

  cout << endl << "Iteration    Inputs    ";
  cout << "Desired Outputs    Actual Outputs" << endl;

  // -------------------------------
  // beginning of main training loop
  do
  {
    /* retrieve a training pair */
    CalculateInputsAndOutputs();
    for (J=0; J < NumRowsPer[0]-1; J++)
      CellArray[J][0].Output = Inputs[J];

    /* set up bias nodes */
    for (I=0; I < NumOfCols-1; I++)
    { CellArray[NumRowsPer[I]-1][I].Output = 1.0;
      CellArray[NumRowsPer[I]-1][I].Error = 0.0;
    }

    /**************************
     *      FORWARD PASS      *
     **************************/

    /* hidden layers */
    for (I=1; I < NumOfCols-1; I++)
      for (J=0; J < NumRowsPer[I]-1; J++)
      { Sum = 0.0;
        for (K=0; K < NumRowsPer[I-1]; K++)
          Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
        CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
        CellArray[J][I].Error = 0.0;
      }

    /* output layer */
    for (J=0; J < NumOUTs; J++)
    { Sum = 0.0;
      for (K=0; K < NumRowsPer[NumOfCols-2]; K++)
        Sum += CellArray[J][NumOfCols-1].Weights[K] * CellArray[K][NumOfCols-2].Output;
      CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
      CellArray[J][NumOfCols-1].Error = 0.0;
    }

    /**************************
     *      BACKWARD PASS     *
     **************************/

    /* calculate error at each output node */
    for (J=0; J < NumOUTs; J++)
      CellArray[J][NumOfCols-1].Error =
        DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;

    /* check to see how many consecutive oks seen so far */
    existsError = 0;
    for (J=0; J < NumOUTs; J++)
      if (fabs(CellArray[J][NumOfCols-1].Error) >
          (.9*Criteria/(extrema[NumINs+J][1]-extrema[NumINs+J][0])))
        existsError = 1;
    if (existsError == 0) ConvergedIterations++;
    else ConvergedIterations = 0;

    /* apply derivative of squashing function to output errors */
    for (J=0; J < NumOUTs; J++)
      CellArray[J][NumOfCols-1].Error =
        CellArray[J][NumOfCols-1].Error
        * CellArray[J][NumOfCols-1].Output
        * (1.0 - CellArray[J][NumOfCols-1].Output);

    /* backpropagate error */
    /* output layer */
    for (J=0; J < NumRowsPer[NumOfCols-2]; J++)
      for (K=0; K < NumRowsPer[NumOfCols-1]; K++)
        CellArray[J][NumOfCols-2].Error = CellArray[J][NumOfCols-2].Error
          + CellArray[K][NumOfCols-1].Weights[J]
          * CellArray[K][NumOfCols-1].Error
          * (CellArray[J][NumOfCols-2].Output)
          * (1.0-CellArray[J][NumOfCols-2].Output);

    /* hidden layers */
    for (I=NumOfCols-3; I>=0; I--)
      for (J=0; J < NumRowsPer[I]; J++)
        for (K=0; K < NumRowsPer[I+1]-1; K++)
          CellArray[J][I].Error = CellArray[J][I].Error
            + CellArray[K][I+1].Weights[J]
            * CellArray[K][I+1].Error
            * (CellArray[J][I].Output)
            * (1.0-CellArray[J][I].Output);

    /* adjust weights */
    for (I=1; I < NumOfCols; I++)
      for (J=0; J < NumRowsPer[I]; J++)
        for (K=0; K < NumRowsPer[I-1]; K++)
        { newDelta = (Momentum * CellArray[J][I].PrevDelta[K]) +
            (LearningRate * CellArray[K][I-1].Output * CellArray[J][I].Error);
          CellArray[J][I].Weights[K] = CellArray[J][I].Weights[K] + newDelta;
          CellArray[J][I].PrevDelta[K] = newDelta;
        }

    GenReport(Iteration);
    Iteration++;
  } while (!((ConvergedIterations >= CritrIt) || (Iteration >= MaxIterate)));
  // end of main training loop
  // -------------------------------

  FinReport(ConvergedIterations);
  TrainForward();
  TestForward();
  return(0);
}
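/* Reader's note (added commentary): ScaleDown maps a raw value X from
   its observed range [lo, hi] (the extrema gathered in GetData) onto
   [0.05, 0.95], keeping targets away from the sigmoid's asymptotes at
   0 and 1; ScaleOutput inverts this mapping for reporting. */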
/*******************************************
   Scale Desired Output to 0..1
*******************************************/
double ScaleDown(double X, int which)
{ double allPos;
  allPos = .9*(X-extrema[which][0])/(extrema[which][1]-extrema[which][0])+.05;
  return (allPos);
}

/*******************************************
   Scale actual output to original range
*******************************************/
double ScaleOutput(double X, int which)
{ double range = extrema[which][1] - extrema[which][0];
  double scaleUp = ((X-.05)/.9) * range;
  return (extrema[which][0] + scaleUp);
}

/*******************************************
   Run Test Data - forward pass only
*******************************************/
void TestForward()
{ int GoodCount=0;
  double Sum, TotalError=0;

  cout << "Running Test Cases" << endl;
  for (int H=0; H < TestCases; H++)
  { TestInputsAndOutputs();
    for (int J=0; J < NumRowsPer[0]-1; J++)
      CellArray[J][0].Output = Inputs[J];

    /* hidden layers */
    for (int I=1; I < NumOfCols-1; I++)
      for (int J=0; J < NumRowsPer[I]-1; J++)
      { Sum = 0.0;
        for (int K=0; K < NumRowsPer[I-1]; K++)
          Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
        CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
        CellArray[J][I].Error = 0.0;
      }

    /* output layer */
    for (int J=0; J < NumOUTs; J++)
    { Sum = 0.0;
      for (int K=0; K < NumRowsPer[NumOfCols-2]; K++)
        Sum += CellArray[J][NumOfCols-1].Weights[K] * CellArray[K][NumOfCols-2].Output;
      CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
      CellArray[J][NumOfCols-1].Error =
        DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;
      if (fabs(CellArray[J][NumOfCols-1].Error) <=
          (.9*TestCriteria/(extrema[NumINs+J][1]-extrema[NumINs+J][0])))
        GoodCount++;
      TotalError += CellArray[J][NumOfCols-1].Error * CellArray[J][NumOfCols-1].Error;
    }
    GenReport(-1);
  }
  cout << endl;
  cout << "Sum Squared Error for Testing cases    = " << TotalError << endl;
  cout << "% of Testing Cases that meet criteria  = "
       << ((double)GoodCount/(double)TestCases);

  output_stream.open(OpFile);
  if ((double)GoodCount/(double)TestCases == 1)
    output_stream << "0";
  else
    output_stream << "1";
  output_stream.close();
  cout << endl;
  cout << endl;
}

/*****************************************************
   Run Training Data forward pass only, after training
******************************************************/
void TrainForward()
{ int GoodCount=0;
  double Sum, TotalError=0;

  cout << endl << "Confirm Training Cases" << endl;
  for (int H=0; H < TrainCases; H++)
  { CalculateInputsAndOutputs();
    for (int J=0; J < NumRowsPer[0]-1; J++)
      CellArray[J][0].Output = Inputs[J];

    /* hidden layers */
    for (int I=1; I < NumOfCols-1; I++)
      for (int J=0; J < NumRowsPer[I]-1; J++)
      { Sum = 0.0;
        for (int K=0; K < NumRowsPer[I-1]; K++)
          Sum += CellArray[J][I].Weights[K] * CellArray[K][I-1].Output;
        CellArray[J][I].Output = 1.0 / (1.0+exp(-Sum));
        CellArray[J][I].Error = 0.0;
      }

    /* output layer */
    for (int J=0; J < NumOUTs; J++)
    { Sum = 0.0;
      for (int K=0; K < NumRowsPer[NumOfCols-2]; K++)
        Sum += CellArray[J][NumOfCols-1].Weights[K] * CellArray[K][NumOfCols-2].Output;
      CellArray[J][NumOfCols-1].Output = 1.0 / (1.0+exp(-Sum));
      CellArray[J][NumOfCols-1].Error =
        DesiredOutputs[J]-CellArray[J][NumOfCols-1].Output;
      if (fabs(CellArray[J][NumOfCols-1].Error) <=
          (.9*Criteria/(extrema[NumINs+J][1]-extrema[NumINs+J][0])))
        GoodCount++;
      TotalError += CellArray[J][NumOfCols-1].Error * CellArray[J][NumOfCols-1].Error;
    }
    GenReport(-1);
  }
  cout << endl;
  cout << "Sum Squared Error for Training cases    = " << TotalError << endl;
  cout << "% of Training Cases that meet criteria  = "
       << ((double)GoodCount/(double)TrainCases) << endl;
  cout << endl;
}
/*******************************************
   Final Report
*******************************************/
void FinReport(int CIterations)
{ cout.setf(ios::fixed);
  cout.setf(ios::showpoint);
  cout.precision(4);
  if (CIterations<CritrIt) cout << "Network did not converge" << endl;
  else cout << "Converged to within criteria" << endl;
  cout << "Total number of iterations = " << Iteration << endl;
}

/*******************************************
   Generation Report
   pass in a -1 if running test cases
*******************************************/
void GenReport(int Iteration)
{ int J;
  cout.setf(ios::fixed);
  cout.setf(ios::showpoint);
  cout.precision(4);
  if (Iteration == -1)
  { for (J=0; J < NumRowsPer[0]-1; J++) cout << " " << ScaleOutput(Inputs[J],J);
    cout << "   ";
    for (J=0; J < NumOUTs; J++) cout << " " << ScaleOutput(DesiredOutputs[J],NumINs+J);
    cout << "   ";
    for (J=0; J < NumOUTs; J++)
      cout << " " << ScaleOutput(CellArray[J][NumOfCols-1].Output,NumINs+J);
    cout << endl;
  }
  else if ((Iteration % ReportIntv) == 0)
  { cout << " " << Iteration << "  ";
    for (J=0; J < NumRowsPer[0]-1; J++) cout << " " << ScaleOutput(Inputs[J],J);
    cout << "   ";
    for (J=0; J < NumOUTs; J++) cout << " " << ScaleOutput(DesiredOutputs[J],NumINs+J);
    cout << "   ";
    for (J=0; J < NumOUTs; J++)
      cout << " " << ScaleOutput(CellArray[J][NumOfCols-1].Output,NumINs+J);
    cout << endl;
  }
}

/**************************************************
 HAAGPerformancePrediction
 --------------------------------------------------
 Ameya Deo, California State University, Sacramento
 --------------------------------------------------
 Version 1.0 - March 2012
 --------------------------------------------------
 Code for running HAAGPerformancePrediction (PSL)
****************************************************/

a = get("/CPU/CPU/CPUCpuUtil/value");
b = get("/MEMORY/MEMORY/MEMUsedMemPerc/value");
c = get("/FILESYSTEM/opt-maestro/FSCapacity/value");
d = get("/FILESYSTEM/root/FSCapacity/value");

fileName = "/tmp/asd/TestingData1.dat";
f1 = fopen(fileName,"w");
string = sprintf("%f %f %f %f 0 ",a/100,b/100,c/100,d/100);
write(f1,string);
close(f1);

cmd = "/tmp/asd/pp.sh";
execute("OS", cmd);

fileName = "/tmp/asd/Output.dat";
f2 = fopen(fileName,"r");
fseek(f2,0,0);
data = read(f2,1);
set("/HAAG/HAAG/HAAGPerformancePrediction/value",data);
close(f2);

/****************************************************
 Parameter Details (HAAG.km)
****************************************************/

HELP_FILE = "puk.hlp",
HELP_CONTEXT_ID = 2620,
OK_PICTURE = "health_ok.xpm",
WRONG_PICTURE = "health_warn.xpm",
PARAMETERS = {
  { NAME = "HAAGPerformancePrediction",
    PARAM_TYPE = STANDARD,
    ACTIVE = True,
    MONITOR = True,
    CHECK = False,
    BASE_COMMAND = {
      { COMPUTER_TYPE = "ALL_COMPUTERS",
        COMMAND_TYPE = "PSL",
        COMMAND_TEXT = 1333365744
          "a = get(\"/CPU/CPU/CPUCpuUtil/value\");\
           b = get(\"/MEMORY/MEMORY/MEMUsedMemPerc/value\");\
           c = get(\"/FILESYSTEM/opt-maestro/FSCapacity/value\");\
           d = get(\"/FILESYSTEM/root/FSCapacity/value\");\
           fileName = \"/tmp/asd/TestingData1.dat\";\
           f1 = fopen(fileName,\"w\");\
           string = sprintf(\"%f %f %f %f 0 \",a/100,b/100,c/100,d/100);\
           write(f1,string);\
           close(f1);\
           cmd = \"/tmp/asd/pp.sh\";\
           execute(\"OS\", cmd);\
           fileName = \"/tmp/asd/Output.dat\";\
           f2 = fopen(fileName,\"r\");\
           fseek(f2,0,0);\
           data = read(f2,1);\
           set(\"/HAAG/HAAG/HAAGPerformancePrediction/value\",data);\
           close(f2);" }
    },
    START = "ASAP",
    POLL_TIME = "60",
    EXTERNAL_POLLING = False,
    TITLE = "Performance Prediction (Using Neural Network)",
    HISTORY_TIME = "60",
    HISTORY_SPAN = 0,
    HISTORY_LEVEL = False,
    FORMAT = "%f",
    OUTPUT = OUTPUT_GRAPH,
    AUTO_RESCALE = False,
    Y_AXIS_MIN = 0,
    Y_AXIS_MAX = 1,
    RANGES = {
      { NAME = "BORDER", ACTIVE = True, MINIMUM = 0, MAXIMUM = 1,
        STATE = OK, ALARM_WHEN = ALARM_INSTANT, ALARM_WHEN_N = 0 },
      { NAME = "ALARM1", ACTIVE = False, MINIMUM = 0, MAXIMUM = 0,
        STATE = ALARM, ALARM_WHEN = ALARM_INSTANT, ALARM_WHEN_N = 0 },
      { NAME = "ALARM2", ACTIVE = True, MINIMUM = 1, MAXIMUM = 1,
        STATE = ALARM, ALARM_WHEN = ALARM_INSTANT, ALARM_WHEN_N = 0 }
    }
  },

APPENDIX B
Dataset

Dataset used for training the neural network (values as extracted; the original row layout was lost in extraction):

0.7000 0.0600 0.5900 0.1200 0.8400 0.2010 0.9000 0.1969 0.6540 0.2100
0.7900 0.1927 1.0000 0.1952 0.1230 0.1965 0.9236 0.2084 0.1955 0.1984
0.1778 0.1992 0.2072 0.1994 0.2036 0.1953 0.1980 0.2028 0.1936 0.2101
0.1957 0.2057 0.1982 0.2015 0.1978 0.2009 0.2017 0.1982 0.2059 0.1978
0.1946 0.2057 0.1982 0.2017 0.2005 1.0000 0.7923 0.2933 0.1230 0.6523
0.0645 0.7100 0.2457 0.7709 0.5187 0.8121 0.2368 0.9913 0.2367 0.9612
0.2367 1.0000 0.2683 1.0000 0.2368 0.7090 0.2368 0.2384 0.2384 0.2384
0.2381 0.2380 0.2380 0.2381 0.2380 0.2380 0.2380 0.2380 0.2377 0.2377
0.2377 0.2377 0.2377 0.2377 0.2377 0.2377 0.2377 0.2377 0.2377 0.2377
0.2377 0.2377 0.2377 0.2377 0.2377 1.0000 0.9902 0.9909 1.0000 0.7312
0.0184 0.8345 0.2786 0.6346 0.0712 0.7000 0.3542 0.8910 0.3542 0.5933
0.3421 0.0200 0.5421 0.0012 0.3542 0.0210 0.3542 0.3542 0.3542 0.3542
0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542
0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542 0.3542
0.3542 0.3542 0.3542 0.3542 0.3542 0.1212 0.2933 0.8122 0.0012 0.7434
0.0990 0.9678 0.2678 0.6838 0.5433 0.5634 0.5561 0.7135 0.5565 0.8123
0.5565 0.0500 0.5651 0.9903 0.0965 0.5622 0.5065 0.5566 0.5566 0.5566
0.5566 0.5566 0.5566 0.5566 0.5566 0.5566 0.5566 0.5566 0.5565 0.5565
0.5565 0.5565 0.5565 0.5565 0.5565 0.5565 0.5565 0.5565 0.5565 0.5565
0.5565 0.5565 0.5565 0.2001 0.8123 0.8242 0.9903
1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 1 1 1

APPENDIX C
Results

Case 1: Parameter values for a non-busy system

Test Data:
CPUCpuUtil = 3.383
MEMUsedMemPerc = 27.288
FSCapacity (OPT-Maestro) = 3.542
FSCapacity (Root) = 56.19

RESULT:
Sum Squared Error for Training cases = 0.0129
% of Training Cases that meet criteria = 1.0000
Running Test Cases
0.0317 0.2763 0.0354 0.5620   0.0000   -0.0318
Sum Squared Error for Testing cases = 0.0008
% of Testing Cases that meet criteria = 1.0000

Case 2: Parameter values for a moderately busy system

Test Data:
CPUCpuUtil = 83.00
MEMUsedMemPerc = 79.00
FSCapacity (OPT-Maestro) = 3.542
FSCapacity (Root) = 56.19

RESULT:
Sum Squared Error for Training cases = 0.0128
% of Training Cases that meet criteria = 1.0000
Running Test Cases
0.8300 0.7900 0.0354 0.5619   0.0000   0.9989
Sum Squared Error for Testing cases = 0.8082
% of Testing Cases that meet criteria = 0.0000

Case 3: Parameter values for a busy system

Test Data:
CPUCpuUtil = 100.00
MEMUsedMemPerc = 85.00
FSCapacity (OPT-Maestro) = 25.50
FSCapacity (Root) = 40.89

RESULT:
Sum Squared Error for Training cases = 0.0128
% of Training Cases that meet criteria = 1.0000
Running Test Cases
1.0000 0.8500 0.2550 0.4089   0.0000   1.0203
Sum Squared Error for Testing cases = 0.8433
% of Testing Cases that meet criteria = 0.0000

BIBLIOGRAPHY

1. Philip D. Wasserman (Anza Research, Inc.), Neural Computing: Theory and Practice. Van Nostrand Reinhold, 1989.

2. [BMCCS] BMC Software Customer Support website, www.bmc.com/support. Accessed November 15, 2011.
3. [WIKIBP] Wikipedia, "Backpropagation," http://en.wikipedia.org/wiki/Backpropagation. Accessed December 2, 2011.

4. Christos Stergiou and Dimitrios Siganos, "Neural Networks," http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html. Accessed March 5, 2012.

5. Source code: Neural Network with Backpropagation, Version 3.1. Adapted from D. Whitley, Colorado State University; modifications by S. Gordon.

6. [1] Modified from BMC Software, Patrol Agent Reference Manual, Version 3.9.00, www.bmc.com/support. Accessed November 15, 2011.