A Hybrid Approach to Estimate True Density Function for Gene Expression Data

DOI: 10.1007/978-3-642-24055-3_5

ABSTRACT Accurate classification of diseases from microarray gene expression profiles is a challenging task because the data are high-dimensional
with few samples. Most gene selection methods discretize the continuous-valued gene expression data in order to estimate the
marginal and joint probabilities; this discretization introduces an inherent error and reduces classification accuracy.
To overcome this difficulty, a hybrid fuzzy-rough set approach is proposed that generates fuzzy equivalence classes and constructs
a fuzzy equivalence partition matrix to estimate the true density function of the continuous-valued gene expression data
without discretization. The performance of the proposed approach is evaluated on six gene expression datasets. The f-Information measure is used for gene selection and a back propagation network is used for classification. Simulation results
show that the proposed method estimates the true density function correctly without discretizing the continuous gene expression
values. Further, the proposed approach simplifies the integration required to compute the f-Information measure and selects highly informative genes that produce good classification accuracy.

Keywords: Gene Expression Profiles – Fuzzy-Rough Set – f-Information – Back Propagation Network
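
The estimation idea described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' method: the Gaussian membership functions, the product-based joint distribution, the choice of mutual information as the member of the f-Information family, and all function names (`fuzzy_partition_matrix`, `crisp_partition_matrix`, `mutual_information`) are hypothetical. The back propagation classifier is not shown.

```python
import math
import random


def fuzzy_partition_matrix(values, n_classes=3):
    """Rows are fuzzy classes, columns are samples; each column sums to 1.

    Assumption: Gaussian memberships with evenly spaced centers
    (e.g. low / medium / high expression). Requires n_classes >= 2.
    """
    lo, hi = min(values), max(values)
    spacing = (hi - lo) / (n_classes - 1) or 1.0  # guard against constant genes
    centers = [lo + k * spacing for k in range(n_classes)]
    sigma = spacing / 2.0
    raw = [[math.exp(-0.5 * ((v - c) / sigma) ** 2) for v in values]
           for c in centers]
    totals = [sum(col) for col in zip(*raw)]
    return [[m / t for m, t in zip(row, totals)] for row in raw]


def crisp_partition_matrix(labels):
    """One-hot membership matrix for class labels (a crisp partition)."""
    classes = sorted(set(labels))
    return [[1.0 if lab == c else 0.0 for lab in labels] for c in classes]


def mutual_information(m_p, m_q):
    """Mutual information (in nats) estimated from two partition matrices.

    Assumption: the joint probability of fuzzy class i and class k is the
    sample average of the product of memberships, which yields a valid
    joint distribution because every column sums to 1.
    """
    n = len(m_p[0])
    joint = [[sum(a * b for a, b in zip(row_p, row_q)) / n for row_q in m_q]
             for row_p in m_p]
    p = [sum(row) for row in joint]
    q = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for k, p_ik in enumerate(row):
            if p_ik > 0.0:
                mi += p_ik * math.log(p_ik / (p[i] * q[k]))
    return mi


# Synthetic data only: one gene that tracks the class label, one that does not.
rng = random.Random(0)
labels = [0] * 50 + [1] * 50
informative_gene = [2.0 * y + rng.gauss(0.0, 0.3) for y in labels]
noise_gene = [rng.gauss(0.0, 1.0) for _ in labels]

class_matrix = crisp_partition_matrix(labels)
mi_informative = mutual_information(
    fuzzy_partition_matrix(informative_gene), class_matrix)
mi_noise = mutual_information(
    fuzzy_partition_matrix(noise_gene), class_matrix)
print("informative gene:", mi_informative)
print("noise gene:", mi_noise)
```

Ranking genes by such a score and keeping the highest-scoring ones is one way to read "gene selection" here. Because the memberships are averaged directly, no histogram binning of the expression values is needed.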

    ABSTRACT: Cancer classification is one major application of microarray data analysis. Because microarray data are ultra-high-dimensional, dimension reduction has drawn special attention for this type of analysis. The currently available dimension reduction methods are either supervised, requiring labeled data, or computationally complex. In this paper, we propose using a revised locally linear embedding (LLE) method, which is purely unsupervised and fast, as the feature extraction strategy for microarray data analysis. Three publicly available microarray datasets are used to test the proposed method. The effectiveness of LLE is evaluated by the classification accuracy of an SVM classifier. Generally, the results are promising.
    Proceedings of 3rd Asia-Pacific Bioinformatics Conference, 17-21 January 2005, Singapore; 01/2005