A distributed PSO“SVM hybrid system with feature selection and

advertisement
A distributed PSO–SVM hybrid system
with feature selection and parameter
optimization
Cheng-Lung Huang & Jian-Fan Dun
Soft Computing 2008
Introduction
Hybridizing the particle swarm optimization
(PSO) and support vector machines (SVM) to
improve the classification accuracy with a small
and appropriate feature subset.
 Combining the discrete PSO with the continuous-valued PSO.
 Implementing the system via a distributed architecture using web service technology to reduce the computational time.

Introduction
The continuous-valued version is used to find the best SVM model parameters.
 The discrete version is used to search for the optimal feature subset.
 PSO can easily be adapted for parallel processing in a distributed system.
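The parallelism claim above can be sketched with Python's standard multiprocessing module; `evaluate_fitness` is a hypothetical stand-in for the SVM training-and-scoring step, not code from the paper.

```python
from multiprocessing import Pool

def evaluate_fitness(particle):
    # Hypothetical stand-in: in the real system this would train an
    # SVM with the particle's parameters and return its accuracy.
    return sum(particle)

def evaluate_swarm(swarm, workers=4):
    # Each particle's fitness is independent of the others, so the
    # whole swarm can be scored in parallel, which is the property the
    # distributed (web-service) architecture exploits.
    with Pool(workers) as pool:
        return pool.map(evaluate_fitness, swarm)
```

For example, `evaluate_swarm([[0, 1, 1], [1, 1, 1]], workers=2)` distributes the two fitness evaluations across worker processes.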

Support Vector Machine
Kernel function: RBF (parameters C and gamma)
 Multi-class strategies:
one-against-one (adopted in this study)
one-against-all
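As a reference for the kernel named above, a minimal RBF kernel in Python with NumPy; gamma is the kernel width and, together with the penalty C, it is one of the two values the continuous-valued PSO tunes.

```python
import numpy as np

def rbf_kernel(x, z, gamma):
    # K(x, z) = exp(-gamma * ||x - z||^2): 1.0 when x == z, decaying
    # toward 0 as the points move apart; gamma sets the decay rate.
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))
```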

Particle swarm optimization
vi,d(t+1) = w · vi,d(t) + c1 · Rnd( ) · (Pi,d − xi,d(t)) + c2 · Rnd( ) · (Pg,d − xi,d(t))
Rnd( ) is a random function in the range [0, 1].
Positive constants c1 and c2 are the personal and social learning factors.
w is the inertia weight, which balances global exploration against local exploitation.
Pi,d denotes the best previous position encountered by the ith particle.
Pg,d denotes the global best position found so far.
t denotes the iteration counter.
Particle swarm optimization

The new position of a particle is calculated using the following formula:
xi,d(t+1) = xi,d(t) + vi,d(t+1)
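The velocity and position updates above can be sketched in Python (function and parameter names are illustrative, not from the paper):

```python
import random

def pso_step(x, v, p_best, g_best, w=0.8, c1=2.0, c2=2.0):
    # Velocity update: inertia term plus personal (c1) and social (c2)
    # attraction toward the best positions seen so far.
    new_v = [w * v[d]
             + c1 * random.random() * (p_best[d] - x[d])
             + c2 * random.random() * (g_best[d] - x[d])
             for d in range(len(x))]
    # Position update: move the particle by its new velocity.
    new_x = [x[d] + new_v[d] for d in range(len(x))]
    return new_x, new_v
```

A particle that already sits at both its personal and the global best keeps only its inertia-damped velocity, so the swarm gradually settles.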
Binary PSO

S(vi,d) = 1 / (1 + exp(−vi,d)),  xi,d = 1 if rnd( ) < S(vi,d), else xi,d = 0

The function S(v) is a sigmoid limiting transformation and rnd( ) is a random number selected from a uniform distribution in [0, 1].
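A short Python sketch of the binary update (illustrative names; the sigmoid and threshold rule follow the text above):

```python
import math
import random

def sigmoid(v):
    # S(v) maps a velocity into (0, 1), interpreted as the probability
    # that the corresponding feature bit is set to 1.
    return 1.0 / (1.0 + math.exp(-v))

def binary_position(velocity):
    # Bit d becomes 1 when rnd() < S(v_d), and 0 otherwise.
    return [1 if random.random() < sigmoid(v) else 0 for v in velocity]
```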
Particle representation
Features mask (discrete-valued)
 C (continuous-valued)
 Gamma (continuous-valued)

Fitness definition
WA: weight of the SVM classification accuracy.
 acci: SVM classification accuracy.
 WF: weight of the features.
 fj: the value of the feature mask; ‘‘1’’ represents that feature j is selected and ‘‘0’’ represents that feature j is not selected.
 nF: the total number of features.
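One fitness consistent with these definitions, as a hedged Python sketch (the paper's exact weighting scheme may differ): accuracy is rewarded and large feature subsets are penalized.

```python
def fitness(acc, mask, w_acc=0.8, w_feat=0.2):
    # Hedged sketch: w_acc plays the role of WA, w_feat the role of WF;
    # sum(mask) / len(mask) is the fraction of features selected, so a
    # smaller subset raises the second term and thus the fitness.
    return w_acc * acc + w_feat * (1.0 - sum(mask) / len(mask))
```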

Strategies for setting the inertia weight
Data descriptions
There are eight target classes to be classified in this data set.
 The data set has 30 features, of which only five (f5, f10, f15, f20, and f25) are relevant to the eight classes.

Experimental procedures
Randomly split the data into ten groups using
stratified 10-fold cross validation.
 Each group contains training, validation and test
sets.
 The training set is used to build the SVM model.
 The validation set is used to determine the proper training iteration to avoid overtraining.
 The test set is used to evaluate the model’s
classification accuracy.
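A minimal pure-Python sketch of stratified k-fold index assignment (the helper name is illustrative; the paper used standard stratified 10-fold cross validation):

```python
from collections import defaultdict

def stratified_folds(labels, k=10):
    # Deal each class's sample indices round-robin into k folds so that
    # every fold preserves the overall class proportions.
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds
```

Each held-out group can then be subdivided into the training, validation, and test parts described above.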

Setting of the system parameters
Experimental results
HITF : the number of hits on correct features.
 COVERF : the number of times the selected
feature subset covered the correct features.
 RATIOF : the ratio of correct features for the ten
experiments (10-fold CV).

Experimental results


f: denotes the feature subset selected by the PSO.
F: denotes the correct discriminating features (f5, f10, f15, f20, and f25 in this experiment).
Experimental results
Fitness
Distributed architectures
CPU Time
Conclusions
Input feature subset selection and kernel parameter setting are crucial problems.
 This study proposed a new hybrid PSO–SVM
system to solve these two problems.
 To overcome the long training time when dealing
with a large-scale dataset, the PSO–SVM can be
implemented with a distributed parallel
architecture.

Thank You