Discriminant Analysis Methods for Microarray Data Classification

DOI: 10.1007/978-3-540-89378-3_26
Source: DBLP


Studies using DNA microarray technologies produce high-dimensional data. To alleviate the “curse of dimensionality”
and analyze these data more effectively, many linear and non-linear dimension reduction methods, such as PCA and LLE, have been widely
studied. In this paper, we report our work on microarray data classification with three recently proposed discriminant analysis
methods: Locality Sensitive Discriminant Analysis (LSDA), Spectral Regression Discriminant Analysis (SRDA), and Supervised
Neighborhood Preserving Embedding (S-NPE). Experimental results on four data sets show that SRDA is both highly effective and
efficient.
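
SRDA recasts discriminant analysis as a regularized regression problem. The sketch below illustrates that idea on synthetic data shaped like a microarray study (few samples, many features). It is not the authors' implementation. The response construction (class indicators orthogonalized against the all-ones vector), the ridge penalty `alpha`, and the names `srda_fit`, `srda_transform`, and `nearest_centroid_predict` are all assumptions made for illustration.

```python
import numpy as np

def srda_fit(X, y, alpha=1.0):
    """Sketch of SRDA-style training (assumed form, not the paper's code).

    X: (m, n) samples-by-features matrix, y: (m,) class labels.
    Returns the feature mean and an (n, c-1) projection matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    m, n = X.shape

    # Step 1: build c-1 response vectors. Each is constant within a class
    # and orthogonal to the all-ones vector (QR does the Gram-Schmidt step).
    indicators = (y[:, None] == classes[None, :]).astype(float)
    basis = np.hstack([np.ones((m, 1)), indicators[:, :-1]])
    Q, _ = np.linalg.qr(basis)
    Y = Q[:, 1:]  # drop the column that spans the all-ones direction

    # Step 2: ridge regression of the responses on the centered data.
    mean = X.mean(axis=0)
    Xc = X - mean
    if n > m:
        # Dual form: solve an m-by-m system when features outnumber samples.
        A = Xc.T @ np.linalg.solve(Xc @ Xc.T + alpha * np.eye(m), Y)
    else:
        # Primal form: solve an n-by-n system.
        A = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n), Xc.T @ Y)
    return mean, A

def srda_transform(X, mean, A):
    """Project samples into the (c-1)-dimensional discriminant space."""
    return (np.asarray(X, dtype=float) - mean) @ A

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Assign each embedded sample to the class with the closest centroid."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]

# Synthetic stand-in for microarray data: 60 samples, 500 "genes", 3 classes.
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1, 2], 20)
X_demo = rng.normal(size=(60, 500))
X_demo[y_demo == 1, :10] += 2.0    # class 1 shifts the first 10 features
X_demo[y_demo == 2, 10:20] += 2.0  # class 2 shifts the next 10 features

mean_demo, A_demo = srda_fit(X_demo, y_demo, alpha=1.0)
Z_demo = srda_transform(X_demo, mean_demo, A_demo)
pred_demo = nearest_centroid_predict(Z_demo, y_demo, Z_demo)
train_accuracy = float((pred_demo == y_demo).mean())
print("embedding shape:", Z_demo.shape)
print("training accuracy:", train_accuracy)
```

The only linear systems solved are ridge regressions, so no dense eigen-decomposition is needed. For large sparse data an iterative least-squares solver could replace the direct solve used here.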

  • Source
    Annals of Human Genetics 01/1936; 7(7):179-188. DOI:10.1111/j.1469-1809.1936.tb02137.x · 2.21 Impact Factor
  • Source
    ABSTRACT: Linear Discriminant Analysis (LDA) has been a popular method for extracting features which preserve class separability. It has been widely used in many fields of information processing, such as machine learning, data mining, information retrieval, and pattern recognition. However, the computation of LDA involves the eigen-decomposition of dense matrices, which can be computationally expensive in both time and memory. Specifically, LDA has O(mnt + t^3) time complexity and requires O(mn + mt + nt) memory, where m is the number of samples, n is the number of features, and t = min(m, n). When both m and n are large, it is infeasible to apply LDA. In this paper, we propose a novel algorithm for discriminant analysis, called Spectral Regression Discriminant Analysis (SRDA). By using spectral graph analysis, SRDA casts discriminant analysis into a regression framework which facilitates both efficient computation and the use of regularization techniques. Our theoretical analysis shows that SRDA can be computed with O(ms) time and O(ms) memory, where s (≤ n) is the average number of non-zero features in each sample. Extensive experimental results on four real world data sets demonstrate the effectiveness and efficiency of our algorithm.
    Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on; 05/2008
  • Source
    ABSTRACT: We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent additive or multiplicative noise, experimental artifacts, or even irrelevant biological processes enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.
    Proceedings of SPIE - The International Society for Optical Engineering 12/2001; 4266. DOI:10.1117/12.427986 · 0.20 Impact Factor
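
The singular value decomposition described in the last abstract can be sketched with NumPy. This is a minimal illustration on synthetic data. It assumes the expression matrix is laid out as genes × arrays, so that the rows of `Vt` play the role of eigengenes and the columns of `U` the role of eigenarrays. The function names and the choice of which component to filter are hypothetical.

```python
import numpy as np

def eigengene_decomposition(E):
    """SVD of a genes-by-arrays expression matrix E.

    Returns U (columns: eigenarrays), singular values s, Vt (rows:
    eigengenes), and the fraction of overall expression each one captures.
    """
    U, s, Vt = np.linalg.svd(np.asarray(E, dtype=float), full_matrices=False)
    fraction = s**2 / np.sum(s**2)
    return U, s, Vt, fraction

def filter_eigengenes(E, drop):
    """Rebuild E with the listed eigengene/eigenarray pairs removed."""
    U, s, Vt, _ = eigengene_decomposition(E)
    s_filtered = s.copy()
    s_filtered[np.asarray(list(drop), dtype=int)] = 0.0
    return (U * s_filtered) @ Vt

# Synthetic data: 200 genes, 8 arrays, with a strong steady-state offset
# that dominates the first eigengene, plus a weaker oscillating signal.
rng = np.random.default_rng(0)
offset = 10.0 * np.ones((200, 8))
phase = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
signal = np.outer(rng.normal(size=200), np.sin(phase))
E_demo = offset + signal + 0.1 * rng.normal(size=(200, 8))

U_demo, s_demo, Vt_demo, fraction_demo = eigengene_decomposition(E_demo)
E_filtered = filter_eigengenes(E_demo, drop=[0])
print("fraction captured by each eigengene:", np.round(fraction_demo, 3))
```

Zeroing a singular value removes that eigengene and its paired eigenarray together. Which components count as noise or artifact is a judgment about the experiment, not something the decomposition decides.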