Data Reduction by Locally Linear Embedding and its Applications to Tumor Classification

Hao Xiong
Department of Computer Science
Texas A&M University

ABSTRACT

Gene expression profiles may carry more information than morphology and thus provide an alternative to morphology-based tumor classification systems. However, gene expression profiles are very high-dimensional, which can severely compromise the accuracy of expression-based tumor classification. Dimensionality reduction is therefore a fundamental problem in gene expression data analysis. Principal component analysis is a classical method for data reduction. As an alternative, this report presents the locally linear embedding (LLE) method for dimension reduction. LLE computes low-dimensional embeddings of high-dimensional gene expression data that preserve the neighborhood topology of the original data. LLE has recently been recognized as highly capable of learning the global nonlinear structure of the data. The LLE method for data reduction was applied to the classification of 22 normal colon tissue samples and 40 colon tumor tissue samples. The results are promising.
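
To make the LLE idea concrete, the following is a minimal sketch of the standard three-step LLE procedure (neighbor search, local reconstruction weights, eigenvector embedding) in Python with NumPy. It is an illustration under stated assumptions, not necessarily the exact procedure used in this report: the function name `lle`, the parameters `n_neighbors`, `n_components`, and `reg`, and their default values are hypothetical choices, and the input is random placeholder data rather than the colon expression data.

```python
import numpy as np


def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Minimal locally linear embedding sketch (illustrative, not tuned).

    X is an (n_samples, n_features) array; the result has shape
    (n_samples, n_components). All parameter defaults are assumptions.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Step 1: find the k nearest neighbors of each sample (Euclidean distance).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each sample from its neighbors,
    # constrained to sum to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]  # neighbors centered on sample i
        C = Z @ Z.T                 # local Gram matrix, shape (k, k)
        trace = np.trace(C)
        # Regularize so the system stays solvable when C is near-singular.
        C += np.eye(n_neighbors) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: the embedding is given by the bottom eigenvectors of
    # M = (I - W)^T (I - W), skipping the first (near-zero eigenvalue) one.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1]


# Usage on random placeholder data (NOT real gene expression measurements).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 20))
Y_demo = lle(X_demo, n_neighbors=8, n_components=2)
print(Y_demo.shape)
```

In a classification setting such as the one described here, the rows of the returned embedding would serve as low-dimensional features for a downstream classifier; the choice of `n_neighbors` and `n_components` would normally be tuned, for example by cross-validation.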