ASSIGNMENT CODE: [not given]
ASSIGNMENT NAME: Building ANN Using WEKA
BUSINESS INTELLIGENCE KS091323
Understand Artificial Neural Network and Implementation with Waikato Environment for Knowledge Analysis
DUE DATE: 01.10.09
SUBMISSION DATE: 01.10.09
EXAMINED BY:
PREPARED BY:
1. Rama Catur APP
2. Arief Rakhman
3. Goeij Yong Sun
5207 100 077, 5207 100 092, 5207 100 098
Information System Department, Faculty of Information Technology
INSTITUT TEKNOLOGI SEPULUH NOPEMBER

Building Artificial Neural Network Using WEKA Software

Arief Rakhman1, Goeij Yong Sun2, Rama Catur APP.3
Information System Department, Sepuluh Nopember Institute of Technology at Surabaya, Indonesia
Email addresses: 1 armanke13@is.its.ac.id, 2 asun@is.its.ac.id, 3 rama_si0ju@is.its.ac.id

Abstract

In WEKA, Artificial Neural Networks (ANNs) are built with the Multilayer Perceptron (MLP) function. MLP is a classifier that uses backpropagation to classify instances. The network can be built by hand, created by an algorithm, or both. This study explores this WEKA feature for building an ANN.

Keywords: Artificial Neural Network, WEKA, Multilayer Perceptron.

Introduction

Artificial Neural Networks (ANNs) denote a set of connectionist models inspired by the behavior of the human brain. In particular, the Multilayer Perceptron (MLP) is the most popular ANN architecture, where neurons are grouped in layers and only forward connections exist [1]. This provides a powerful base-learner, with advantages such as nonlinear mapping and noise tolerance, increasingly used in Data Mining due to its good behavior in terms of predictive knowledge [2].

The human brain is a densely interconnected network of approximately 10^11 neurons, each connected, on average, to 10^4 others. Neuron activity is excited or inhibited through connections to other neurons. The fastest neuron switching times are known to be on the order of 10^-3 seconds.
The human brain is remarkable in how densely its neurons are connected and in how quickly they switch, on the order of 10^-3 seconds. So what is the connection between the human brain and an ANN? An ANN is modeled on the human brain, and each part of a biological neuron has a counterpart in an ANN: the nucleus corresponds to a node, the dendrites to the inputs, the axon to the output, and the synapses to the weights. The differences are that the brain sometimes slows down while an ANN does not, and that the brain has about 10^11 neurons while an ANN typically has only dozens to thousands.

Literature Review

Artificial Neural Network

An Artificial Neural Network is a mathematical or computational model that tries to simulate the structure and/or functional aspects of biological neural networks. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation [10]. An ANN is an adaptive system that changes its structure based on the information that flows through the network during computation. It is a kind of non-linear statistical data modeling tool, usually used to model complex relationships or to find patterns in data.

There is still no single agreed definition of what a neural network is. Experts agree, however, that it involves simple processing elements (neurons) whose global behavior is determined by the connections between the processing elements and by the element parameters. The idea is inspired by the human brain, with its neurons, dendrites, axons, and synapses. An ANN likewise has simple nodes that are connected to other nodes to form a network. They work together to transform, process, and translate signals into something humans can understand. An ANN also does not need to be adapted by hand to each problem: it has a learning algorithm that alters its weights until the desired signal strength is obtained.
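The node, input, weight, and output vocabulary above can be made concrete with a single artificial neuron. The following is only a minimal sketch, not WEKA's implementation: the sigmoid activation, the delta-rule update, and all numeric values (inputs, starting weights, target, learning rate) are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Nonlinear, differentiable activation that squashes x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs (one weight per input), then activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def update_weights(inputs, weights, bias, target, learning_rate):
    """One delta-rule step: nudge each weight to reduce the squared error."""
    out = neuron_output(inputs, weights, bias)
    delta = (target - out) * out * (1.0 - out)  # error times sigmoid slope
    new_weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + learning_rate * delta
    return new_weights, new_bias


# Hypothetical example values, chosen only for illustration.
inputs = [1.0, 0.5]
weights = [0.2, -0.4]
bias = 0.0
target = 1.0

before = neuron_output(inputs, weights, bias)
for _ in range(100):
    weights, bias = update_weights(inputs, weights, bias, target, 0.5)
after = neuron_output(inputs, weights, bias)

print("output before training:", before)
print("output after training: ", after)
```

Repeating the update moves the output toward the target, which is the sense in which the weights are altered to obtain the desired signal strength.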
Multilayer Perceptron

A multilayer perceptron is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. It is a modification of the standard linear perceptron in that it uses three or more layers of neurons (nodes) with nonlinear activation functions, and it is more powerful than the perceptron in that it can distinguish data that is not linearly separable, that is, not separable by a hyperplane. [3]

Activation function

If a multilayer perceptron has a linear activation function in all neurons, that is, if each neuron's output is just a linear function of its weighted inputs, then it is easily proved with linear algebra that any number of layers can be reduced to the standard two-layer input-output model. What makes a multilayer perceptron different is that each neuron uses a nonlinear activation function, which was developed to model the frequency of action potentials, or firing, of biological neurons in the brain. This function is modeled in several ways, but must always be normalizable and differentiable. [5]

Layers

The multilayer perceptron consists of an input layer and an output layer with one or more hidden layers of nonlinearly activating nodes. Each node in one layer connects with a certain weight w_ij to every node in the following layer. [5]

Applications

Multilayer perceptrons using a backpropagation algorithm are a standard algorithm for supervised-learning pattern recognition and the subject of ongoing research in computational neuroscience and parallel distributed processing. They are useful in research for their ability to solve problems stochastically, which often allows one to get approximate solutions for extremely complex problems like fitness approximation. [5] Currently, they are most commonly seen in speech recognition, image recognition, and machine translation software, but they have also seen applications in other fields such as cyber security.
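The claim in the Activation function paragraph, that purely linear layers reduce to a single input-output mapping, can be checked numerically, and a small nonlinear network shows what is gained. This is a sketch under stated assumptions: all weights are hypothetical and set by hand, and the XOR network uses a simple threshold unit for readability, which is not differentiable and therefore could not be trained by backpropagation as it stands.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Two linear layers with hypothetical weights: 2 inputs -> 3 hidden -> 1 output.
W1 = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
W2 = [[1.0, 0.0, -1.0]]
x = [2.0, 1.0]

two_layers = matvec(W2, matvec(W1, x))   # apply layer 1, then layer 2
one_layer = matvec(matmul(W2, W1), x)    # apply the single product matrix
print(two_layers, one_layer)             # both give the same output


def step(value):
    """Threshold unit: fires (1) when the weighted input is positive."""
    return 1 if value > 0 else 0


def xor_network(x1, x2):
    """Hand-weighted network: hidden units compute OR and AND."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)       # OR but not AND = XOR


xor_table = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(xor_table)
```

XOR is the textbook case of data that no single hyperplane separates, so the second half only works because the hidden units are nonlinear.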
In general, their most important use has been in the growing field of artificial intelligence, although the multilayer perceptron does not have the connection with biological neural networks that the initial neural-based networks have. [4]

Methodology

First, we explore WEKA's features and functions by downloading the software and sample datasets and opening the datasets with the software. Then we try to build an ANN based on a particular dataset. We also review journals, literature, and scientific internet articles. Then we discuss the findings and write this paper.

The scope covered by WEKA is wider than ANN alone. We focus only on the ANN building functionality. Our experiment is limited to a pre-given dataset: breast cancer (the details are explained later).

Result: Building ANN Using WEKA

WEKA

WEKA is an abbreviation of Waikato Environment for Knowledge Analysis. It is a popular suite of machine learning software written in Java, developed at the University of Waikato. WEKA is free software available under the GNU General Public License. [6]

The Weka workbench [7] contains a collection of visualization tools and algorithms for data analysis and predictive modelling, together with graphical user interfaces for easy access to this functionality. The original non-Java version of Weka was a TCL/TK front-end to (mostly third-party) modelling algorithms implemented in other programming languages, plus data preprocessing utilities in C, and a Makefile-based system for running machine learning experiments. This original version was primarily designed as a tool for analyzing data from agricultural domains [11][8], but the more recent fully Java-based version (Weka 3), for which development started in 1997, is now used in many different application areas, in particular for educational purposes and research.

The main strengths of WEKA are that it is freely available under the GNU General Public License, that it is very portable because it is written in the Java programming language and so runs on almost all modern platforms, that it contains a large number of data preprocessing and modeling techniques, and that its graphical user interface makes it easy for beginners to use.

Weka supports several standard data mining tasks, more specifically, data preprocessing, clustering, classification, regression, visualization, and feature selection. All of Weka's techniques are predicated on the assumption that the data is available as a single flat file or relation, where each data point is described by a fixed number of attributes (normally numeric or nominal attributes, but some other attribute types are also supported). Weka provides access to SQL databases using Java Database Connectivity and can process the result returned by a database query. It is not capable of multi-relational data mining, but there is separate software for converting a collection of linked database tables into a single table that is suitable for processing using Weka [9]. Another important area that is currently not covered by the algorithms included in the Weka distribution is sequence modeling.

This paper explains some simple examples of how to build an ANN with the WEKA software, using sample data provided on WEKA's own website.

Before we can use WEKA, it needs to be installed on a computer. The installer can be downloaded from the official website at http://weka.wikispaces.com. We also need the Java Runtime Environment (JRE), because WEKA is written in the Java programming language and cannot run without it. The JRE is to WEKA what water is to a fish.

After installation is complete, we can start exploring WEKA. When we run it, it displays a simple window with four modules: Explorer, Experimenter, KnowledgeFlow, and Simple CLI.
For the complete functions and usage of each module, see the manual available in the installation directory. For now, we start with the Explorer module.

Fig. 1 Main Window of WEKA

Start the Explorer module by clicking on it. A larger Weka Explorer window appears. We load the data available from the website by clicking the Open file button at the top of the window. Here we choose the breast cancer data. The window then lists the attributes, and we can choose one to visualize as a chart.

Fig. 2 Explore of breast cancer data

It is easy to use. When we want to visualize another attribute, we simply click the desired attribute and the chart changes. We can also see the charts for all attributes at once by clicking the Visualize All button.

Fig. 3 All charts shown by the Visualize All button

All of the above is just the Explorer module, with which we can explore the data we currently have. To build the complete ANN process, we use the KnowledgeFlow module. We start this module by clicking the KnowledgeFlow button, which opens a window. In that window, we can make a flow from collecting the data to the last step of the ANN.

Fig. 4 KnowledgeFlow window

Then we start to build a flow that represents the ANN process.

Fig. 5 ANN Flow

Here we will not study how to make the flow; we only analyze its result. We use ArffLoader to load the data, ClassAssigner to set which attribute is the class, CrossValidationFoldMaker to split the data into cross-validation training and test folds, MultilayerPerceptron as the classifier, ClassifierPerformanceEvaluator to evaluate the result of the classifier, and TextViewer to view the result. We can set the number of hidden layers, the amount of training, and so on from the classifier. Simply double-click the MultilayerPerceptron icon.

Fig. 6 Classifier Option window

From here, we can set all the attributes that affect how the ANN works. For example, we can set how many hidden layers are used during the flow.
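Among the options in this window, the learning rate is the easiest to illustrate in isolation. The sketch below is not WEKA code: it runs plain gradient descent on an assumed one-dimensional error surface E(w) = (w - 3)^2, with hypothetical rate values, to show how the step size changes the path of a weight.

```python
def gradient_descent(learning_rate, steps=20, start=0.0, optimum=3.0):
    """Minimize E(w) = (w - optimum)^2 and return every visited weight."""
    w = start
    path = [w]
    for _ in range(steps):
        gradient = 2.0 * (w - optimum)    # dE/dw
        w = w - learning_rate * gradient  # step against the gradient
        path.append(w)
    return path


small = gradient_descent(0.01)    # tiny steps: still far from 3 after 20 steps
moderate = gradient_descent(0.4)  # converges quickly toward 3
large = gradient_descent(1.0)     # overshoots by the same amount every step

for name, path in [("small", small), ("moderate", moderate), ("large", large)]:
    print(name, "final weight:", path[-1], "distance:", abs(path[-1] - 3.0))
```

With the largest rate the weight jumps between the same two values on either side of the optimum, while the smallest rate barely moves in the same number of steps.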
The learning rate affects the learning process: a rate that is too small slows learning down, while a rate that is too large causes overcorrection, so the weights go back and forth among possible values and never reach the optimal point. The rest of the options can be adjusted freely to meet our needs.

Conclusion

Building an ANN using WEKA is simple enough. The software is useful when we want to build a simple ANN. WEKA is full of functions and features, but users must know the basics of data mining and of the algorithm they use, since they have to decide its parameters and attributes. If they do not, they should read the documentation first. A suggested next step after this study is to compare other algorithms with ANN.

References

[1] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed., Prentice-Hall, New Jersey, 1999.
[2] S. Mitra, S. Pal, P. Mitra, Data mining in soft computing framework: a survey, IEEE Trans. Neural Networks 13 (1) (2002) 3-14.
[3] G. Cybenko (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems (MCSS), 2(4), 303-314.
[4] P.D. Wasserman, T. Schwartz (1988). Neural networks. II. What are they and why is everybody so interested in them now? IEEE Expert, Volume 3, Issue 1, pp. 10-15.
[5] Wikipedia The Free Encyclopedia (2009). Multilayer_perceptron. Accessed November 4, 2009, from Wikipedia: http://en.wikipedia.org/wiki/Multilayer_perceptron#cite_note-0
[6] http://en.wikipedia.org/wiki/Weka_(machine_learning)
[7] Ian H. Witten; Eibe Frank (2005). "Data Mining: Practical machine learning tools and techniques, 2nd Edition". Morgan Kaufmann, San Francisco. http://www.cs.waikato.ac.nz/~ml/weka/book.html. Retrieved 2007-06-25.
[8] G. Holmes; A. Donkin and I.H. Witten (1994). "Weka: A machine learning workbench". Proc Second Australia and New Zealand Conference on Intelligent Information Systems, Brisbane, Australia. http://www.cs.waikato.ac.nz/~ml/publications/1994/Holmes-ANZIIS-WEKA.pdf. Retrieved 2007-06-25.
[9] P. Reutemann; B. Pfahringer and E. Frank (2004). "Proper: A Toolbox for Learning from Relational Data with Propositional and Multi-Instance Learners". 17th Australian Joint Conference on Artificial Intelligence (AI2004). Springer-Verlag. http://www.cs.waikato.ac.nz/~eibe/pubs/reutemann_et_al.ps.gz. Retrieved 2007-06-25.
[10] http://en.wikipedia.org/wiki/Artificial_neural_network
[11] S.R. Garner; S.J. Cunningham, G. Holmes, C.G. Nevill-Manning, and I.H. Witten (1995). "Applying a machine learning workbench: Experience with agricultural databases". Proc Machine Learning in Practice Workshop, Machine Learning Conference, Tahoe City, CA, USA. pp. 14-21. http://www.cs.waikato.ac.nz/~ml/publications/1995/Garner95-imlc95.pdf. Retrieved 2007-06-25.