Prediction of FAD interacting residues in a protein from its primary sequence using evolutionary information

Institute of Microbial Technology, Sector 39A, Chandigarh, India.
BMC Bioinformatics (Impact Factor: 2.67). 01/2010; 11 Suppl 1(Suppl 1):S48. DOI: 10.1186/1471-2105-11-S1-S48
Source: DOAJ

ABSTRACT Flavin binding proteins (FBPs) play a critical role in several biological functions, such as the electron transport system (ETS). These flavoproteins contain very tightly, sometimes covalently, bound flavin adenine dinucleotide (FAD) or flavin mononucleotide (FMN). The interaction between the flavin nucleotide and the amino acids of a flavoprotein is essential for its function. Identifying the FAD interacting residues in an FBP is therefore an important step toward understanding its function and mechanism.
In this study, we describe models developed for predicting FAD interacting residues using window patterns of 15, 17 and 19 residues. Support vector machine (SVM) based models developed using the binary pattern of the protein's amino acid sequence achieved a maximum accuracy of 69.65%, with a Matthews correlation coefficient (MCC) of 0.39 and an area under the curve (AUC) of 0.773. Performance improved significantly, from 69.65% to 82.86% accuracy with an MCC of 0.66 and an AUC of 0.904, when evolutionary information was used as input to the SVM. The evolutionary information was generated in the form of a position-specific scoring matrix (PSSM) profile using PSI-BLAST at an e-value of 0.001. All models were developed on 198 non-redundant FAD binding protein chains containing 5172 FAD interacting residues and were evaluated using fivefold cross-validation.
This study suggests that evolutionary information from 17-residue patterns performs best for predicting FAD interacting residues. We also developed a web server that predicts FAD interacting residues in a protein; it is freely available for academic use.
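
The binary window pattern and the MCC mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the padding symbol `X`, and the 21-symbol alphabet (20 amino acids plus padding) are assumptions made for illustration. Each residue is represented by the one-hot encoding of the window centered on it, and the resulting vectors are the kind of input an SVM could be trained on.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # assumed padding symbol for window positions beyond the chain termini


def binary_windows(sequence, window=17):
    """Return one binary (one-hot) feature vector per residue.

    Each vector has window * 21 entries: 20 amino acids plus the
    assumed padding symbol, for every position in the window.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it has a central residue")
    alphabet = AMINO_ACIDS + PAD
    half = window // 2
    padded = PAD * half + sequence.upper() + PAD * half
    vectors = []
    for i in range(len(sequence)):
        vec = []
        for residue in padded[i:i + window]:
            one_hot = [0] * len(alphabet)
            idx = alphabet.find(residue)
            # Letters outside the 20 standard amino acids are treated as padding.
            one_hot[idx if idx >= 0 else len(alphabet) - 1] = 1
            vec.extend(one_hot)
        vectors.append(vec)
    return vectors


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

With a 17-residue window, `binary_windows(seq)` yields vectors of 17 × 21 = 357 values under these assumptions. A PSSM-based variant would replace each one-hot row with the corresponding row of PSI-BLAST profile scores.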



Available from: Gajendra Pal Singh Raghava, Aug 20, 2015

    • "A computational method for predicting the FMN-binding residues on proteins would greatly facilitate defining FMN-binding sites on protein structures. Computational methods have been developed to predict FMN (Wang et al., 2012), flavin adenine dinucleotide (FAD) (Mishra and Raghava, 2010) and nicotinamide adenine dinucleotide (NAD) (Ansari and Raghava, 2010) binding"
    ABSTRACT: Flavin mononucleotide (FMN) is a cofactor involved in many biological reactions. Insights into protein-FMN interactions aid protein functional annotation and facilitate drug design. In this study, we have established a new method for predicting FMN-binding sites on protein surfaces. It uses an encoding scheme based on three-dimensional probability density maps that describe the distributions of 40 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types to predict tentative FMN-binding sites on protein structures. The method's capability was evaluated by five-fold cross-validation on a dataset containing 81 non-redundant FMN-binding protein structures and further tested on independent datasets of 30 and 15 non-redundant protein structures respectively. These predictions achieved an accuracy of 0.94, 0.94 and 0.96 with a Matthews correlation coefficient (MCC) of 0.53, 0.53 and 0.65 respectively for the three protein structure sets. The prediction capability is superior to the existing method. This is the first structure-based approach that does not rely on evolutionary information for predicting FMN-interacting residues. The webserver for the prediction is available at
    Journal of Theoretical Biology 11/2013; 192. DOI:10.1016/j.jtbi.2013.10.020 · 2.30 Impact Factor
  • Source
    ABSTRACT: Conotoxins are small disulfide-rich peptides that are invaluable channel-targeted peptides and target neuronal receptors. They show prospects for being potent pharmaceuticals in the treatment of Alzheimer's disease, Parkinson's disease, and epilepsy. Accurate and fast prediction of the conotoxin superfamily is very helpful for understanding its biological and pharmacological functions, especially in the post-genomic era. In the present study, we have developed a novel approach called PredCSF for predicting the conotoxin superfamily directly from the amino acid sequence by fusing different kinds of sequential features using modified one-versus-rest SVMs. The input features to the PredCSF classifiers are composed of physicochemical properties, evolutionary information, predicted secondary structure and amino acid composition, where the most important features are further screened by random forest feature selection to improve the prediction performance. The prediction results show that PredCSF can obtain an overall accuracy of 90.65% on a benchmark dataset constructed from the most recent database, which consists of 4 main conotoxin superfamilies and 1 non-conotoxin class. Systematic experiments also show that combining different features helps enhance prediction power when dealing with complex biological problems. PredCSF is expected to be a powerful tool for in silico identification of novel conotoxins and is freely available for academic use at
    Protein and Peptide Letters 10/2010; 18(3):261-7. DOI:10.2174/092986611794578341 · 1.74 Impact Factor
  • Source
    ABSTRACT: Designing kinase inhibitors is always an area of interest because kinases are involved in many diseases. In the last decade a large number of kinase inhibitors have been launched successfully; six inhibitors have been approved by the FDA and more are in clinical trials. Cross-reactivity, or off-target effects, is one of the major problems in designing inhibitors against protein kinases, as humans have more than 500 kinases with high sequence similarity. In this study an attempt has been made to develop a model for predicting the specificity and cross-reactivity of kinase inhibitors. The dataset used for training and testing consists of binding affinities of 20 chemical kinase inhibitors with protein kinases. We developed QSAR based SVM models for predicting the binding affinity of an inhibitor against protein kinases using the most relevant 5, 10 and 15 structure descriptors, achieving average correlations of 0.64, 0.488 and 0.442 respectively. In order to predict the specificity and cross-reactivity of an inhibitor, we developed 16 QSAR based SVM models for 16 protein kinases, one model for each kinase. We achieved an average correlation of 0.719 between actual and predicted binding affinity using kinase specific models. Based on the above study, a web server, DMKPred, has been developed for predicting the binding affinity of a drug molecule with 16 kinases. The SVM based model used in this study can be used to predict kinase specific inhibitors. This study will be useful for designing kinase specific inhibitors.
    Letters in Drug Design & Discovery 03/2011; 8(3-3):223-228. · 0.96 Impact Factor