A Novel Visualization Classifier and Its Applications.
ABSTRACT: Classifiers are among the important tools for analyzing gene expression data in the post-genomic era and have been widely used
to classify different cancer types in the past few years. Although most existing classifiers achieve high classification
accuracy, the classification process is a black box, and they cannot give biologists additional information or interpretable
classification results. In this paper, we propose a novel visualization-based cancer classification method. Besides offering
high classification accuracy, the method can help identify complex disease-related genes and assess gene expression variation
during classification. The classification results are natural and interpretable, and the classification process
is visible. To evaluate its performance, we applied the proposed method to three public data sets. The experimental
results demonstrate that the approach is feasible and useful.
- Source available from: Jie Li
ABSTRACT: Microarray experiments are now performed in many laboratories, resulting in the rapid accumulation of microarray data in public repositories. One of the major challenges in analyzing microarray data is how to extract and select effective features for accurate cancer classification. Here we introduce a new feature extraction and selection method based on information gene pairs that change significantly between different tissue samples. Experimental results on five public microarray data sets demonstrate that the feature subset selected by the proposed method performs well and achieves higher classification accuracy with several classifiers. We extensively compare the features selected by the proposed method with features selected by other methods, using different evaluation methods and classifiers. The results confirm that the proposed method performs as well as other methods on the acute lymphoblastic-acute myeloid leukemia, adenocarcinoma, and breast cancer data sets while using fewer information genes, and that it significantly improves classification accuracy on the colon cancer and diffuse large B cell lymphoma data sets.
Pattern Recognition 06/2008; 41(6-41):1975-1984. DOI:10.1016/j.patcog.2007.11.019 · 2.58 Impact Factor
- ABSTRACT: Classifiers have been widely used to select an optimal subset of feature genes from microarray data for accurate classification of cancer samples and for cancer-related studies. However, the classification rules derived from most classifiers are complex and difficult to interpret biologically. Solving this problem is a new challenge. In this paper, a new classification model based on gene pairs is proposed to address it. Experimental results on several microarray data sets demonstrate that the proposed classification model performs well in finding a large number of excellent feature gene pairs. A leave-one-out cross-validation (LOOCV) classification accuracy of 100% can be achieved using a single classification model based on the optimal feature gene pair or by combining multiple top-ranked classification models. Using the proposed method, we successfully identified important cancer-related genes that had been validated in previous biological studies but were not discovered by the other methods.
Computers in Biology and Medicine 12/2007; 37(11):1637-46. DOI:10.1016/j.compbiomed.2007.03.004 · 1.46 Impact Factor