Particle Swarm Optimization (PSO) and ELM based two-step
approach for cancer classification
S. Saraswathi¹, S. Suresh², N. Sundararajan³
Research, BIRC, School of Computer Engineering, Nanyang Technological University,
Singapore, 639735. E-mail: Saraswathi55@gmail.com, ensundara@ntu.edu.sg
In this paper, the multi-class human cancer detection problem is studied using microarray
gene expression data characterized by sparsity (the GCM data set). We try to classify data
belonging to different types of cancer. The cancer detection problem has a very small
number of samples, each with a large number of gene expression features. Finding the
influence of gene features on a particular class, or selecting appropriate genes to identify
the cancer type, is an open problem and is very pertinent to bioinformatics.
The classification problem raises two issues. One is the selection of appropriate
genes (features) from the given features, and the other is extracting the unknown
functional relationship between the selected features and the true class label. Finding
the appropriate features in a given feature space is an NP-hard problem, so one can use
search techniques to address it. Among the available search techniques, the recently
developed, biologically inspired Particle Swarm Optimization (PSO) technique is
computationally less intensive and can provide better solutions than other search
techniques. For the second issue, neural network methods, particularly the recently
developed Extreme Learning Machine (ELM), are well suited, especially when the
relationship between entities is not yet clearly defined. ELM is a single-hidden-layer
neural network with good generalization capabilities and extremely fast learning. In this
paper, a two-step generic solution is presented. PSO will be used to select the
appropriate features, and the ELM algorithm to extract the functional relationship
between the selected features and the class labels. The performance of the proposed two-step
solution will be compared with the existing statistical selection schemes using GCM data
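
The two-step idea described above (a swarm search over feature subsets, with an ELM scoring each subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which PSO variant, ELM activation, or hyperparameters were used, so the sketch assumes a binary PSO with a sigmoid transfer function, a tanh hidden layer, and arbitrary settings. All function names (`elm_train`, `elm_predict`, `fitness`, `binary_pso`, `make_toy_data`) are hypothetical, and the data is a synthetic stand-in, not the GCM data set.

```python
import numpy as np

def elm_train(X, y, n_hidden, n_classes, rng):
    """Train an ELM: random, untrained hidden layer plus least-squares output weights."""
    n_features = X.shape[1]
    # Fixed random input weights, scaled (an assumption) to limit tanh saturation.
    W = rng.standard_normal((n_features, n_hidden)) / np.sqrt(n_features)
    b = rng.standard_normal(n_hidden)      # fixed random hidden biases
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    T = np.eye(n_classes)[y]               # one-hot class targets
    beta = np.linalg.pinv(H) @ T           # output weights solved in one step
    return W, b, beta

def elm_predict(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

def fitness(mask, X_tr, y_tr, X_va, y_va, n_classes, n_hidden=20, seed=0):
    """Validation accuracy of an ELM trained only on the features selected by mask."""
    if not mask.any():                     # an empty subset cannot classify anything
        return 0.0
    rng = np.random.default_rng(seed)      # fixed seed keeps the score deterministic
    model = elm_train(X_tr[:, mask], y_tr, n_hidden, n_classes, rng)
    return float(np.mean(elm_predict(model, X_va[:, mask]) == y_va))

def binary_pso(score, n_features, n_particles=10, n_iter=15,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO (assumed variant): each particle is a 0/1 feature mask."""
    rng = np.random.default_rng(seed)
    pos = (rng.random((n_particles, n_features)) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, (n_particles, n_features))
    pbest = pos.copy()                     # best mask seen by each particle
    pbest_fit = np.array([score(p.astype(bool)) for p in pos])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        # Sigmoid of the velocity gives the probability that a bit is set to 1.
        pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([score(p.astype(bool)) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)

def make_toy_data(n_samples=90, n_features=30, n_classes=3, seed=0):
    """Synthetic stand-in data: only the first n_classes features carry class signal."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, n_samples)
    X = rng.standard_normal((n_samples, n_features))
    X[np.arange(n_samples), y] += 3.0      # feature k is shifted upward for class k
    return X, y

# Usage: step 1 searches over feature masks, step 2 (inside fitness) trains the ELM.
X, y = make_toy_data()
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]
score = lambda m: fitness(m, X_tr, y_tr, X_va, y_va, n_classes=3)
mask, best_fit = binary_pso(score, n_features=X.shape[1])
print("selected features:", np.flatnonzero(mask), "validation accuracy:", best_fit)
```

Scoring each candidate subset with a freshly trained classifier is only practical here because ELM training reduces to a single pseudoinverse. With very few samples, as in microarray data, a single validation split is a noisy fitness signal, so a cross-validated score would likely be a safer choice in practice.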