CELEST Application Building Framework for CN710: General Instructions

This document describes how to build data sets, classifiers, and evaluation criteria for use in a Java/Eclipse/Weka-based development environment.

1. Data set Construction

1. In Matlab, generate a two-dimensional matrix called "LoadMatrix", in which columns represent features and rows represent data points. Save the matrix "LoadMatrix" to a .mat file.
2. Download makewikafilesGeneral.m: http://www.cns.bu.edu/~chhsiao/makewikafilesGeneral.m
3. To generate a .txt file that will become the .arff file for Weka, run makewikafilesGeneral(SourcePathAndFileName, TargetFileName)
4. Rename the .txt file so that it has the .arff extension; Weka should then be able to recognize it (see Section 2.7).

2. Classifier Construction

1) Set up the Java/Eclipse/Weka IDE

1.1) Download WEKA from http://www.cs.waikato.ac.nz/ml/weka/
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes.
Download the self-extracting executable without the Java VM and save it to your disk, for example under c:\Program Files.

1.2) Install Eclipse. Eclipse is an Integrated Development Environment (IDE) used for Java programming.
1.2.1) Go to https://cps.bu.edu/sites/support/devsys/wiki/guide/eclipse for instructions on downloading Eclipse and on using the Eclipse IDE to work with Java code.
1.2.2) When your workspace opens, go to the Window menu on the toolbar -> Open Perspective -> SVN Repository Exploring
1.2.3) On the SVN Repository Explorer tab on the left side of the Eclipse environment, right click -> select New -> Repository Location -> type https://cps.bu.edu/src/celest/learninggame in the text box -> click Finish
1.2.4) Expand https://cps.bu.edu/src/celest/learninggame -> right click “learning-game” -> Checkout (you might be asked for a user id and password; enter your edlab user id and password) -> Next -> Finish
1.2.5) If you have not already checked out the jarch projects, do the following (jarch projects are reusable components used to develop projects in the CELEST framework):
1.2.5.1) On the SVN Repository Explorer tab on the left side of the Eclipse environment, right click -> select New -> Repository Location -> type https://cps.bu.edu/src/support/jarch in the text box -> click Finish
1.2.5.2) Expand https://cps.bu.edu/src/support/jarch -> expand trunk -> right click on each folder (jarch-simulation, jarch-data, jarch-util, jarch-gui, jarch-glazedlists, jarch-application) -> Checkout (you might be asked for a user id and password; enter your edlab user id and password) -> Next -> Finish (do not change the project names while checking out).
1.2.6) Once all the jarch projects and “learning-game” are checked out, switch to the Java perspective -> select jarch-data, jarch-glazedlists, jarch-gui, jarch-simulation, jarch-util, jarch-application and learning-game -> right click -> select Team -> select Update

1.3) Include weka.jar in your library collection:
1.3.1) On the toolbar, go to the Window menu -> Open Perspective -> Java
1.3.2) On the Package Explorer tab, on the left side of the Eclipse environment, right click “learning-game” -> Build Path -> Configure Build Path -> select the Libraries tab -> select the weka.jar entry located at \home\fs\student1\chhsiao\localhd\celest\bin (or something similar to this) -> click Remove -> click Add External JARs -> select the weka.jar file from the Weka folder that you downloaded -> click OK

1.4) Set up a new Java class for the classifier code
1.4.1) On the Package Explorer tab in the Eclipse environment, right click on weka.classifiers.functions -> New -> Class -> enter a name (e.g. TrialTemp) -> Finish

1.5) Write the classifier code as a Java class. See http://www.mindview.net/Books/TIJ/ for information about Java programming.
1.5.1) In the window opened for TrialTemp.java, enter the following example class, which shows how to classify the data.

    import weka.classifiers.Classifier;
    import weka.core.Instances;
    import weka.core.Instance;

    // This is the main class; it extends the class Classifier
    public class TrialTemp extends Classifier {
        double t = 0;

        /** This function trains the classifier; put your training code here */
        public void buildClassifier(Instances data) {
        }

        // This function tests (classifies) a single instance
        public double classifyInstance(Instance instance) {
            return 1;
        }

        // This function is used to show the list of parameters
        public String[] getOptions() {
            String[] options = new String[2];
            int current = 0;
            //if (getDebug())
            //    options[current++] = "-D";
            options[current++] = "-t";
            options[current++] = "" + getT();
            while (current < options.length)
                options[current++] = "";
            return options;
        }

        // Gets the parameter
        public double getT() {
            return t;
        }

        // Sets the parameter
        public void setT(double temp) {
            t = temp;
        }

        // Can return some text that can be used for comments or help purposes
        public String TTipText() {
            return "This is a test";
        }
    }

1.5.2) Save your program

2) The following outlines what the different functions in the code do.

2.1) buildClassifier(Instances data) trains the classifier.

    public void buildClassifier(Instances data) {
    }

Information on the class Instances is in the Weka documentation:
http://weka.sourceforge.net/doc/
http://weka.sourceforge.net/doc/weka/core/Instances.html

2.2) classifyInstance(Instance instance) tests the instance.

    public double classifyInstance(Instance instance) {
        return 1;
    }

Information on the class Instance is in the Weka documentation:
http://weka.sourceforge.net/doc/weka/core/Instance.html

2.3) To list and change parameters, getOptions() generates the UI of the parameters.

    public String[] getOptions() {
        String[] options = new String[2];
        int current = 0;
        //if (getDebug())
        //    options[current++] = "-D";
        options[current++] = "-t";
        options[current++] = "" + getT();
        while (current < options.length)
            options[current++] = "";
        return options;
    }

For each parameter:
getT(Parameter) - gets the parameter value and returns it
setT(Parameter) - sets the parameter to the value passed to it
(Parameter)TipText - provides the text or comments needed.

3) (Optional steps to make your own classifier directory)
3.1) Right click the project “learning-game” -> New -> Folder -> from the list, select “learning-game” -> weka-classifiers and type your new folder name -> Finish.
3.2) Put GenericPropertiesCreator.props into the directory.
3.3) Modify GenericPropertiesCreator.props so that WEKA recognizes the directory.

4) Run your Java program: right click the project “learning-game” -> Run As -> SWT Application -> from the list box select “LearningGame” -> click OK
5) The CELEST Weka Explorer opens. Click on the Preprocess tab.
6) Click on Open File.
7) Select a dataset (example: Contact lenses).
8) You will get a data set window. Select the Classify tab.
9) Click the Choose button.
10) A drop-down menu box will appear; from weka-classifiers-functions select your Java file (example: TrialTemp).
11) In the text box you can change the parameters for TrialTemp if needed, for example from -t 0.0 to any other value.
12) Click the Start button to view the classifier output.
13) Sample code for a classifier is available at http://www.cns.bu.edu/~chhsiao/LeastMedSq.java
The full code base and function object are available at http://www.cns.bu.edu/~chhsiao/cn710_Chuan-Heng_Hsiao_code.zip
(See these examples for how to use weka/gui/GenericPropertiesCreator.props to generate a sample classifier.)

3. Evaluation Criteria Construction

3.1) On the Package Explorer tab in the Eclipse environment, right click on edu.bu.cps.celest.learninggame.eval -> New -> Class -> enter a class name (e.g. Trial)
3.2) In the window opened for Trial.java, enter the following code snippet as an example of how to make your own evaluation criterion:

    import edu.bu.cps.celest.learninggame.eval.Eval;
    import weka.core.Instance;

    public class Trial extends Eval {
        int count = 0;

        public Trial() {
        }

        // Called once per classified instance: count up on a correct
        // prediction and down on a wrong one
        public void Evaluate(Instance inst, double pred) throws Exception {
            if (pred == inst.classValue())
                count++;
            else
                count--;
            System.out.println("Temp::Evaluate: count: " + count);
        }

        // Appends the result to the output buffer and returns it
        public StringBuffer Output() {
            out_buff.append("Temp: " + count);
            return out_buff;
        }
    }

3.3) Save the program
3.4) Using the evaluation criteria
3.4.1) Follow the steps from Section 2.4 to run the CELEST Weka Explorer and select the evaluation criteria for output: choose the classifier, select the evaluation criteria, and then choose the evaluation criteria code that you have written. The final result that you get depends on the options you choose.
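
As a further illustration of the Evaluate/Output pattern used by Trial, the following is a minimal sketch of a criterion that reports the number of correct predictions instead of a running up/down count. It is written to run on its own, so it does not extend the CELEST Eval class or use weka.core.Instance: the class name AccuracyCriterion, the methods evaluate, accuracy and output, and the field outBuff are hypothetical stand-ins, and the true class value is passed in directly rather than read from inst.classValue(). Adapting it to the real Eval base class is left as an assumption to check against the CELEST sources.

```java
// Hypothetical stand-in for the Evaluate/Output pattern of the CELEST Eval
// class: one call per classified instance, then one call to build the report.
class AccuracyCriterion {
    private int correct = 0; // predictions equal to the true class value
    private int total = 0;   // all predictions seen so far
    private final StringBuffer outBuff = new StringBuffer();

    // Mirrors Evaluate(Instance inst, double pred); the true class value is
    // passed directly instead of being read from inst.classValue().
    public void evaluate(double actual, double pred) {
        total++;
        if (pred == actual) {
            correct++;
        }
    }

    // Fraction of correct predictions, or 0.0 before any call to evaluate.
    public double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // Mirrors Output(): appends a summary to the buffer and returns it.
    public StringBuffer output() {
        outBuff.append("Accuracy: " + correct + "/" + total);
        return outBuff;
    }

    public static void main(String[] args) {
        AccuracyCriterion crit = new AccuracyCriterion();
        crit.evaluate(1.0, 1.0); // correct
        crit.evaluate(0.0, 1.0); // wrong
        crit.evaluate(2.0, 2.0); // correct
        System.out.println(crit.output()); // prints "Accuracy: 2/3"
    }
}
```

As in Trial, the predicted and true class values are compared with ==, and each call to output() appends to the same buffer, so it is meant to be called once at the end of a run.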