Mem-PHybrid: hybrid features-based prediction system for classifying membrane protein types.
ABSTRACT Membrane proteins are a major class of proteins and encoded by approximately 20% to 30% of genes in most organisms. In this work, a two-layer novel membrane protein prediction system, called Mem-PHybrid, is proposed. It is able to first identify the protein query as a membrane or nonmembrane protein. In the second level, it further identifies the type of membrane protein. The proposed Mem-PHybrid prediction system is based on hybrid features, whereby a fusion of both the physicochemical and split amino acid composition-based features is performed. This enables the proposed Mem-PHybrid to exploit the discrimination capabilities of both types of feature extraction strategy. In addition, minimum redundancy and maximum relevance has also been applied to reduce the dimensionality of a feature vector. We employ random forest, evidence-theoretic K-nearest neighbor, and support vector machine (SVM) as classifiers and analyze their performance on two datasets. SVM using hybrid features yields the highest accuracy of 89.6% and 97.3% on dataset1 and 91.5% and 95.5% on dataset2 for jackknife and independent dataset tests, respectively. The enhanced prediction performance of Mem-PHybrid is largely attributed to the exploitation of the discrimination power of the hybrid features and of the learning capability of SVM. Mem-PHybrid is accessible at http://www.22.214.171.124/Mem-PHybrid.