Data Reduction by Locally Linear Embedding and its Applications to Tumor
Classification
Hao Xiong
Department of Computer Science
Texas A&M University
ABSTRACT
Gene expression profiles may offer more information than morphology and provide an
alternative to morphology-based tumor classification systems. However, gene
expression data are very high-dimensional, which can severely compromise the
accuracy of expression-based tumor classification. Dimensionality reduction is
therefore a key issue in gene expression data analysis. Principal component analysis
(PCA) is a classical method for data reduction. As an alternative to PCA, in this
report I present the locally linear embedding (LLE) method for dimension reduction.
LLE computes a low-dimensional embedding of high-dimensional gene expression data
that preserves the neighborhood topology of the original data. LLE has recently been
recognized for its ability to learn the global nonlinear structure of the data. The
LLE method for data reduction was applied to classifying 22 normal colon tissue
samples and 40 colon tumor tissue samples. The results are promising.
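
To make the idea of a neighborhood-preserving embedding concrete, here is a minimal sketch of the standard three-step LLE procedure (nearest neighbors, reconstruction weights, bottom eigenvectors) written with NumPy. It is an illustration under stated assumptions, not the implementation used in this report: the function name `lle`, its parameters `n_neighbors`, `n_components`, and `reg`, and the default values are all hypothetical choices.

```python
import numpy as np

def lle(X, n_neighbors=8, n_components=2, reg=1e-3):
    """Sketch of LLE: embed the rows of X (samples x features) in n_components dims.

    All names and defaults here are illustrative assumptions.
    """
    n = X.shape[0]

    # Step 1: find each sample's nearest neighbors by squared Euclidean distance.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each sample from its neighbors,
    # constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]           # neighbors centered on sample i
        G = Z @ Z.T                          # local Gram matrix
        trace = np.trace(G)
        # Regularize, since G is singular when n_neighbors exceeds the local rank.
        G = G + np.eye(n_neighbors) * (reg * trace if trace > 0 else reg)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: the embedding is given by the bottom eigenvectors of
    # M = (I - W)^T (I - W), skipping the first (near-constant) one.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]
```

For a samples-by-genes expression matrix `X`, calling `lle(X)` would return one low-dimensional coordinate vector per tissue sample, which could then be passed to any classifier. The neighborhood size and target dimension are tuning choices that the abstract does not specify.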