Prediction of FAD interacting residues in a protein from its primary sequence using evolutionary information

Institute of Microbial Technology, Sector 39A, Chandigarh, India.
BMC Bioinformatics (Impact Factor: 2.58). 01/2010; 11 Suppl 1(Suppl 1):S48. DOI: 10.1186/1471-2105-11-S1-S48
Source: DOAJ


Flavin binding proteins (FBPs) play a critical role in several biological functions, such as the electron transport system (ETS). These flavoproteins contain very tightly bound, sometimes covalently bound, flavin adenine dinucleotide (FAD) or flavin mononucleotide (FMN). The interaction between the flavin nucleotide and the amino acids of a flavoprotein is essential for its functionality. Thus, identification of FAD interacting residues in an FBP is an important step towards understanding its function and mechanism.
In this study, we describe models developed for predicting FAD interacting residues using window patterns of 15, 17 and 19 residues. Support vector machine (SVM) based models were developed using the binary pattern of the protein's amino acid sequence and achieved a maximum accuracy of 69.65% with a Matthews correlation coefficient (MCC) of 0.39 and an area under the curve (AUC) of 0.773. Performance improved significantly, from 69.65% to 82.86% accuracy with an MCC of 0.66 and an AUC of 0.904, when evolutionary information was used as input to the SVM. The evolutionary information was generated in the form of a position-specific scoring matrix (PSSM) profile using PSI-BLAST at an e-value of 0.001. All models were developed on 198 non-redundant FAD binding protein chains containing 5172 FAD interacting residues and evaluated using five-fold cross-validation.
This study suggests that evolutionary information from 17-residue patterns performs best for predicting FAD interacting residues. We also developed a web server, freely available to academic users, that predicts FAD interacting residues in a protein.
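The window-based binary pattern described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract states only the window sizes and that a binary pattern was used, so the 21-symbol alphabet (20 amino acids plus a padding symbol `X`), the handling of unknown residues, and the function names `windows` and `binary_pattern` are all assumptions made for the example.

```python
# Hypothetical sketch: sliding-window binary (one-hot) encoding of a protein sequence.
# Assumption: 20 standard amino acids plus a padding symbol "X" (21 symbols in total).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"
ALPHABET = AMINO_ACIDS + PAD


def windows(sequence, size=17):
    """Return one window per residue, centred on that residue.

    The sequence is padded with PAD on both ends so that terminal
    residues also get a full-length window.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd so that it has a central residue")
    half = size // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + size] for i in range(len(sequence))]


def binary_pattern(window):
    """Encode a window as a flat 0/1 vector with len(ALPHABET) entries per position."""
    vector = []
    for residue in window:
        # Assumption: any non-standard residue is treated like the padding symbol.
        symbol = residue if residue in ALPHABET else PAD
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(symbol)] = 1
        vector.extend(one_hot)
    return vector


if __name__ == "__main__":
    sequence = "MKTAYIAKQR"  # made-up sequence, for illustration only
    features = [binary_pattern(w) for w in windows(sequence, size=17)]
    print(len(features), len(features[0]))
```

Under these assumptions, each residue yields a vector of 17 × 21 = 357 binary features, which could then be passed to any SVM implementation together with a per-residue label (interacting or non-interacting). A PSSM-based variant would replace each one-hot row with the corresponding row of the PSI-BLAST profile.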


Available from: Gajendra Pal Singh Raghava
    • "For FAD-binding proteins, our method yielded 85.68% sensitivity and an MCC of 0.75. These values compared with the performance of the published method (83.36% and 0.66) developed by Mishra and Raghava [11] (Table 4). Our method, therefore, has similar performance in NAD-binding sites predicted but better in FAD-binding sites."
    ABSTRACT: We developed a computational method to identify NAD- and FAD-binding sites in proteins. First, we extracted from the Protein Data Bank structures of proteins that bind to at least one of these ligands. NAD-/FAD-binding residue templates were then constructed by identifying binding residues through the ligand-binding database BioLiP. The fragment transformation method was used to identify structures within query proteins that resembled the ligand-binding templates. By comparing residue types and their relative spatial positions, potential binding sites were identified and a ligand-binding potential for each residue was calculated. Setting the false positive rate at 5%, our method predicted NAD- and FAD-binding sites at true positive rates of 67.1% and 68.4%, respectively. Our method provides excellent results for identifying FAD- and NAD-binding sites in proteins and, most importantly, the requirement of conservation of residue types and local structures in the FAD- and NAD-binding sites can be verified.
    05/2015; 2015:1-13. DOI:10.1155/2015/402536
    • "A computational method for predicting the FMN-binding residues on proteins would greatly facilitate defining FMN-binding sites on protein structures. Computational methods have been developed to predict FMN (Wang et al., 2012), flavin adenine dinucleotide (FAD) (Mishra and Raghava, 2010) and nicotinamide adenine dinucleotide (NAD) (Ansari and Raghava, 2010) binding"
    ABSTRACT: Flavin mononucleotide (FMN) is a cofactor involved in many biological reactions. Insights into protein-FMN interactions aid protein functional annotation and also facilitate drug design. In this study, we have established a new method, making use of an encoding scheme of the three-dimensional probability density maps that describe the distributions of 40 non-covalent interacting atom types around protein surfaces, to predict FMN-binding sites on protein surfaces. One machine learning model was trained for each of the 30 protein atom types to predict tentative FMN-binding sites on protein structures. The method's capability was evaluated by five-fold cross-validation on a dataset containing 81 non-redundant FMN-binding protein structures and further tested on independent datasets of 30 and 15 non-redundant protein structures, respectively. These predictions achieved accuracies of 0.94, 0.94 and 0.96 with Matthews correlation coefficients (MCC) of 0.53, 0.53 and 0.65, respectively, for the three protein structure sets. The prediction capability is superior to the existing method. This is the first structure-based approach that does not rely on evolutionary information for predicting FMN-interacting residues. The webserver for the prediction is available at
    Journal of Theoretical Biology 11/2013; 192. DOI:10.1016/j.jtbi.2013.10.020 · 2.12 Impact Factor
    • "Several prediction methods have been developed for carbohydrates [32,33], lipids [34,35], DNA [29,36-39] and RNA [30,38,40] interacting residues in protein sequence. Some methods have been developed for specific ligands such as ATP [41,42], GTP [43], NAD [44], FAD [45] and mannose [46]. "
    ABSTRACT: Background: Vitamins are important cofactors in various enzymatic reactions. In the past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein if its tertiary structure is known. Unfortunately, tertiary structures are available for only a limited number of proteins. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in a protein from its primary structure. Results: In this study, we first compared protein-interacting residues of vitamins with those of other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues, along with pattern specificity, was also observed within different vitamin classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. This suggested that the protein-binding patterns of vitamins are different from those of other ligands, and motivated us to develop separate predictors for vitamins and their sub-classes. Four different prediction modules have been developed: (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs). We applied various classifiers, including SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk, as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. Finally, we selected the best performing SVM modules and obtained the highest MCC of 0.53, 0.48, 0.61 and 0.81 for VIRs, VAIRs, VBIRs and PLPIRs respectively, using PSSM-based evolutionary information. All the modules developed in this study have been trained and tested on non-redundant datasets and evaluated using five-fold cross-validation. The performances were also evaluated on balanced and different independent datasets. Conclusions: This study demonstrates that it is possible to predict VIRs, VAIRs, VBIRs and PLPIRs from evolutionary information of the protein sequence. In order to provide a service to the scientific community, we have developed a web server and standalone software, VitaPred (
    BMC Bioinformatics 02/2013; 14(1):44. DOI:10.1186/1471-2105-14-44 · 2.58 Impact Factor