Are you S.K. Pal?

Claim your profile

Publications (3)7.82 Total impact

  • Article: Fuzzy–Rough Sets for Information Measures and Selection of Relevant Genes From Microarray Data
    P. Maji, S.K. Pal
    [show abstract] [hide abstract]
    ABSTRACT: Several information measures such as entropy, mutual information, and f -information have been shown to be successful for selecting a set of relevant and nonredundant genes from a high-dimensional microarray data set. However, for continuous gene expression values, it is very difficult to find the true density functions and to perform the integrations required to compute different information measures. In this regard, the concept of the fuzzy equivalence partition matrix is presented to approximate the true marginal and joint distributions of continuous gene expression values. The fuzzy equivalence partition matrix is based on the theory of fuzzy-rough sets, where each row of the matrix represents a fuzzy equivalence partition that can automatically be derived from the given expression values. The performance of the proposed approach is compared with that of existing approaches using the class separability index and the predictive accuracy of the support vector machine. An important finding, however, is that the proposed approach is shown to be effective for selecting relevant and nonredundant continuous-valued genes from microarray data.
    IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) 07/2010; · 3.08 Impact Factor
  • Article: Feature Selection Using f-Information Measures in Fuzzy Approximation Spaces
    P. Maji, S.K. Pal
    [show abstract] [hide abstract]
    ABSTRACT: The selection of nonredundant and relevant features of real-valued data sets is a highly challenging problem. A novel feature selection method is presented here based on fuzzy-rough sets by maximizing the relevance and minimizing the redundancy of the selected features. By introducing the fuzzy equivalence partition matrix, a novel representation of Shannon's entropy for fuzzy approximation spaces is proposed to measure the relevance and redundancy of features suitable for real-valued data sets. The fuzzy equivalence partition matrix also offers an efficient way to calculate many more information measures, termed as f-information measures. Several f-information measures are shown to be effective for selecting nonredundant and relevant features of real-valued data sets. This paper compares the performance of different f-information measures for feature selection in fuzzy approximation spaces. Some quantitative indexes are introduced based on fuzzy-rough sets for evaluating the performance of proposed method. The effectiveness of the proposed method, along with a comparison with other methods, is demonstrated on a set of real-life data sets.
    IEEE Transactions on Knowledge and Data Engineering 07/2010; · 1.66 Impact Factor
  • Article: Rough Set Based Generalized Fuzzy -Means Algorithm and Quantitative Indices
    P. Maji, S.K. Pal
    [show abstract] [hide abstract]
    ABSTRACT: A generalized hybrid unsupervised learning algorithm, which is termed as rough-fuzzy possibilistic C-means (RFPCM), is proposed in this paper. It comprises a judicious integration of the principles of rough and fuzzy sets. While the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in class definition, the membership function of fuzzy sets enables efficient handling of overlapping partitions. It incorporates both probabilistic and possibilistic memberships simultaneously to avoid the problems of noise sensitivity of fuzzy C-means and the coincident clusters of PCM. The concept of crisp lower bound and fuzzy boundary of a class, which is introduced in the RFPCM, enables efficient selection of cluster prototypes. The algorithm is generalized in the sense that all existing variants of C-means algorithms can be derived from the proposed algorithm as a special case. Several quantitative indices are introduced based on rough sets for the evaluation of performance of the proposed C-means algorithm. The effectiveness of the algorithm, along with a comparison with other algorithms, has been demonstrated both qualitatively and quantitatively on a set of real-life data sets.
    IEEE Transactions on Systems Man and Cybernetics Part B (Cybernetics) 01/2008; · 3.08 Impact Factor