
D2C-SVM Matlab (v 1.0) Manual
By D. Lai Copyright 2006.
Department of Electrical and Electronic Engineering,
University of Melbourne, Parkville Campus, Melbourne, Australia.
Email: d.lai@ee.unimelb.edu.au.
Figure 1: Easy to use GUI interface.
Introduction
Welcome to the Matlab GUI interface written for easy use of the D2C-SVM software for
training Support Vector Machines (SVMs). What is an SVM? The SVM is a function
estimation technique based on Statistical Learning Theory, developed by V. Vapnik from
the 1970s onwards and popularized in the early 1990s. The standard SVM is a binary
classifier which has found widespread use in pattern recognition problems such as image
and audio recognition, handwriting recognition, medicine, science, finance and so on.
For an introduction to SVM theory, try the following references:
a) Cristianini, N. and Shawe-Taylor, J., An Introduction to Support Vector Machines
and Other Kernel-Based Learning Methods, Cambridge University Press, New York,
2000.
b) Vapnik, V. N., The Nature of Statistical Learning Theory, Springer, New York,
2000.
D2C-SVM is yet another SVM training package; it implements a heuristic training
algorithm that improves the training efficiency of the SVM. This short manual aims to
help the user install and run the Matlab GUI interface to the D2C-SVM training package,
and to introduce the SVM as an easy-to-use pattern recognition tool.
Installation and System Requirements
D2CMatlab v 1.0 comes in a single zip file named D2CMatlab.zip which can be
downloaded from
http://www.ee.unimelb.edu.au/people/dlai/
This first version is intended to be used with the D2C-SVM classifier package, which is
included in the zip file. The user needs Matlab v 6.01 or above installed in order to run
the GUI. Generally, a recent computer such as a Pentium IV with more than 256 MB of
RAM will have no problems running this software. Earlier PCs should cope as well;
training of the SVM simply becomes slower as your data size increases.
Input Files
The program (v1.0) currently accepts either a training or a test file in sparse format. A
training example in the file has the following format:

<label> 1:<attribute 1> 2:<attribute 2> 3:<attribute 3> ...

e.g.

+1 1:0.3421 3:2.3424 5:-1.2342
-1 4:23.31
If your data is in matrix format, it can easily be read in using Matlab's load function; if
the data is in an Excel file, it can be read using xlsread. The data must then be converted
to sparse format before it can be used by the program.
An Excel-to-sparse file converter is provided in the D2CMatlab tool directory and can be
used for fast and easy conversion of data files that are not in sparse format.
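If you prefer to script the conversion yourself, a dense matrix can be written out in sparse format with a few lines of base Matlab. This is only an illustrative sketch, not the bundled converter; the variable names are arbitrary:

```matlab
% Sketch: write labels y (n x 1) and features X (n x m) to a
% sparse-format text file, skipping zero-valued attributes.
y = [+1; -1];
X = [0.3421 0 2.3424; 0 23.31 0];
fid = fopen('train.dat', 'w');
for i = 1:size(X, 1)
    fprintf(fid, '%+d', y(i));          % signed label, e.g. +1 or -1
    for j = 1:size(X, 2)
        if X(i, j) ~= 0
            fprintf(fid, ' %d:%g', j, X(i, j));  % index:value pair
        end
    end
    fprintf(fid, '\n');
end
fclose(fid);
```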
SVM parameters
Figure 2: SVM Parameter Selection Panel
Experienced SVM users will find that the GUI allows the SVM parameters to be entered
and changed easily, so that training can be done painlessly. This is a big advantage over
the DOS command-line interfaces found in many of the other SVM packages. The user
can change C, the algorithm tolerance level, the kernel parameters and other options.
For new SVM users, the following is a simple rundown of the options available:
a) C: The SVM penalty parameter controls the tradeoff between overfitting and
underfitting the classifier on the training set. If C is set large, the number of
training errors is reduced, but the classifier may then perform poorly on test data
(overfitting). If C is too small, the number of training errors increases and the
classifier performs poorly on the training set, while NOT necessarily performing
well on the test data; one could end up with a really bad classifier which does not
classify anything well. Generally, set C=10 and check the classification
accuracies, then adjust the parameter to obtain a better model.
C Cycle: Selecting this option allows easy training of the SVM model using
increasing values of C. The C step determines the multiplicative factor with which
C is increased. Press ‘Run’ again to automatically train different SVMs with
increasing C values.
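The effect of the C Cycle option can be mimicked by hand: train several SVMs while multiplying C by a fixed step and compare the resulting accuracies. A minimal sketch, in which the training call itself is only a commented placeholder (it is not a function shipped with this package):

```matlab
% Sketch of the C Cycle idea: C is multiplied by a fixed step each run.
C = 1; Cstep = 10; nRuns = 4;
Cvalues = C * Cstep.^(0:nRuns-1);   % 1, 10, 100, 1000
for k = 1:numel(Cvalues)
    fprintf('Training with C = %g\n', Cvalues(k));
    % acc(k) = <train and score an SVM with C = Cvalues(k)>;  % placeholder
end
```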
b) Tol: This is the tolerance level to which the classifier is optimized, and is
intended to allow finite termination of the algorithm. The default of 0.001 seems
to work well for practical purposes. Smaller tolerances tend to increase training
time while giving better final accuracies, and vice versa.
c) Kernel Parameters: The three standard kernels are included, namely the linear,
polynomial and RBF (Gaussian) kernels. Future versions of this software will
include other kernels, e.g. the sigmoid.
d) Model: Selects the training algorithm to be used. The D2C Adaptive Heuristic is
currently the default and seems to work well for most data sets. The standard
maximal-violator algorithm is the one used in many other SVM packages, the
Naïve algorithm is generally slower, and the ASVM models are semiparametric
versions of the SVM classifier.
Status Window
Figure 3: Status Window
The status window provides information on the simulation status. The window is color
coded: GREEN means the simulation is ready for user input (e.g. before starting, or after
a simulation has finished), while RED means that a simulation is running and the user
cannot input or change any further variables.
a) Run: Executes the simulation with the selected choice of SVM parameters.
b) Reset: Resets all variables to default values, clears buffers and clears the graph
axes.
c) Plot Graph: Plots the relevant graph depending on the analysis selections made.
d) Save Graph: Opens the graph in a Matlab figure to allow saving, adjusting of
axes and exporting to other formats.
e) Exit: Closes the simulation and returns to Matlab command line.
Analysis Tools
Figure 4: Analysis Tools Panel
The current version of this software supports two types of analysis, namely accuracy
analysis and graphical analysis. In accuracy analysis, the standard cross-validation and
leave-one-out accuracies have been implemented, while the graphical analysis tool set
currently contains five different methods.
a) Cross Validation: Partitions the main data set into n subsets called folds. The
user selects the number of folds for the experiment. The average accuracy over
the n folds is then reported, along with the sensitivity and specificity ratios.
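The fold partitioning behind this option can be sketched in a few lines of Matlab. This is a generic illustration of n-fold cross validation, not the package's internal code:

```matlab
% Sketch of n-fold partitioning: shuffle the indices, split them into
% n roughly equal folds, and hold each fold out in turn for testing.
n = 5;                          % number of folds
N = 100;                        % number of examples
idx = randperm(N);              % random shuffle of example indices
foldOf = mod(0:N-1, n) + 1;     % fold assignment 1..n
for f = 1:n
    testIdx  = idx(foldOf == f);
    trainIdx = idx(foldOf ~= f);
    % train on trainIdx, test on testIdx, and record the accuracy...
end
% the reported figure is the average of the n recorded accuracies
```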
b) Leave One Out: This is the extreme case of cross validation, where the SVM is
trained n times, each time on n-1 data points with the remaining point used for
testing. In short, the average accuracy is obtained by leaving each point out of the
main data set in turn and testing the classifier trained on the remaining data.
Sensitivity: A measure of how good the classifier is at recognizing positive
examples, i.e. the fraction of positive examples classified as positive.
Specificity: A measure of how good the classifier is at recognizing negative
examples, i.e. the fraction of negative examples classified as negative.
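From a vector of true labels and a vector of predictions, both ratios reduce to simple counts. A self-contained Matlab sketch with made-up data:

```matlab
% Sensitivity and specificity from true labels y and predictions p
% (both vectors of +1/-1 values; the data here is illustrative only).
y = [ 1  1  1 -1 -1 -1 -1]';
p = [ 1  1 -1 -1 -1  1 -1]';
tp = sum(p ==  1 & y ==  1);   % true positives
fn = sum(p == -1 & y ==  1);   % false negatives
tn = sum(p == -1 & y == -1);   % true negatives
fp = sum(p ==  1 & y == -1);   % false positives
sensitivity = tp / (tp + fn);  % fraction of positives caught
specificity = tn / (tn + fp);  % fraction of negatives caught
```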
c) ROC Plot: Plots the receiver operating characteristic (ROC) curve of the
classifier. A larger area under the ROC curve generally means a better-performing
classifier.
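The curve itself is obtained by sweeping a decision threshold over the raw SVM outputs and recording (1 - specificity, sensitivity) at each point. A generic Matlab sketch of this computation (again with made-up scores, not the package's own routine):

```matlab
% Sketch of an ROC curve from raw SVM scores s and labels y: sort by
% score, then cumulative counts give the true/false positive rates.
y = [ 1  1 -1  1 -1 -1]';
s = [2.1 0.7 0.3 -0.2 -0.5 -1.3]';
[~, ord] = sort(s, 'descend');
yy  = y(ord);
tpr = cumsum(yy ==  1) / sum(y ==  1);  % sensitivity at each threshold
fpr = cumsum(yy == -1) / sum(y == -1);  % 1 - specificity at each threshold
plot([0; fpr], [0; tpr]);
xlabel('1 - specificity'); ylabel('sensitivity');
```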
d) Best Feature Selection: Automatically selects the set of best features from the
training data for a particular SVM model. The feature numbers are written to an
output file, "BestFeature.txt". Selection of the best features is based on a hill-
climbing method driven by accuracies, which can be estimated using n-fold cross
validation or leave-one-out errors.
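One common form of hill climbing is greedy forward selection: start with no features and repeatedly add the single feature that most improves the estimated accuracy, stopping when no addition helps. Whether D2C-SVM uses exactly this variant is an assumption; the sketch below also substitutes a toy scoring function where the real tool would use a cross-validated accuracy:

```matlab
% Greedy forward (hill-climbing) feature selection sketch.  evalAcc is
% a TOY stand-in for an n-fold CV accuracy estimate: features 2 and 5
% are "relevant", and each extra feature costs a small penalty.
evalAcc = @(feats) 0.5 + 0.2*numel(intersect(feats, [2 5])) ...
                       - 0.01*numel(feats);
m = 10;                       % total number of features
chosen = [];                  % selected feature set
best = 0;                     % best accuracy so far
improved = true;
while improved
    improved = false;
    for j = setdiff(1:m, chosen)
        acc = evalAcc([chosen j]);
        if acc > best
            best = acc; bestFeat = j; improved = true;
        end
    end
    if improved, chosen = [chosen bestFeat]; end
end
```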
e) 2D SVM Surface: Plots the SVM decision surface in input space for 2 features.
This is a useful graphical option to view the performance of the classifier if you
only want 2 features and the resulting separating boundary. It is not meant to be
used to decide the best SVM parameters or classification accuracy for data sets
with more than 2 features.
f) 3D SVM Surface: This adds a third feature to option e) and allows similar
visualization. Unfortunately, dimensions 4 and higher are considerably harder to
view graphically and hence are not included here.
g) SVM Posterior Probability: Plots a graph of the SVM outputs for the test data
after calibration to posterior probabilities. A test example may have an SVM
output of, say, 5.34, which on its own is only useful for assigning it to the +1
class; the conversion to a posterior probability indicates how sure we are that the
example belongs to the +1 class. Values range from 0 to 1.
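Such calibration is commonly done with a sigmoid fit in the style of Platt (1999); whether D2C-SVM uses exactly this form is an assumption here. Given fitted parameters A and B, the mapping of a raw output to a probability is:

```matlab
% Map a raw SVM output f to an approximate posterior P(y = +1 | f) via
% a fitted sigmoid (Platt-style).  A and B would normally come from a
% separate fit on held-out data; the values here are illustrative.
A = -1; B = 0;                       % example fitted parameters
f = 5.34;                            % raw SVM output
posterior = 1 / (1 + exp(A*f + B));  % near 1: confident +1 example
```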
Figure 5: ROC Graph Analysis
Disclaimer: The author is not liable for any damages whatsoever that may occur from
usage of this software. This software is copyrighted but may be distributed freely for use.
Any modifications should be made only with written consent from the author. Please cite
the following when using this software in your research work.
BibTeX citation:
@misc{D2CSVM,
  author = {Lai, D.},
  title  = {{D2C-SVM}: A heuristic algorithm for training {S}upport {V}ector {M}achines},
  year   = {2005},
  note   = {Software available at {\tt http://www.ee.unimelb.edu.au/people/dlai/}}
}
Send all email queries, bug reports, criticisms and opinions to
daniel.thlai@hotmail.com