Prediction of Metal Ion–Binding Sites in Proteins Using the Fragment Transformation Method

Graduate Institute of Molecular Systems Biomedicine, China Medical University, Taichung, Taiwan.
PLoS ONE (Impact Factor: 3.53). 06/2012; 7(6):e39252. DOI: 10.1371/journal.pone.0039252
Source: PubMed

ABSTRACT The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins tend to be conserved in both sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses sequence and structural information to predict the residues in metal ion-binding sites. Six types of metal ion-binding templates (those involving Ca²⁺, Cu²⁺, Fe³⁺, Mg²⁺, Mn²⁺, and Zn²⁺) were constructed using the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall accuracy of 94.6% with a true positive rate of 60.5% at a 5% false positive rate and therefore constitutes a significant improvement in metal-binding site prediction.
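The template-construction step described in the abstract (collecting the residues within 3.5 Å of the metal ion center) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Atom` record, the `residues_within_cutoff` function, the residue labels, and the toy coordinates are all hypothetical, and the sketch assumes that a residue counts as part of the site if any one of its atoms lies within the cutoff, which the abstract does not specify.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    """One protein atom (hypothetical record; coordinates in angstroms)."""
    residue_id: str  # illustrative label such as "HIS94"
    x: float
    y: float
    z: float


def residues_within_cutoff(metal_xyz, atoms, cutoff=3.5):
    """Return the sorted IDs of residues with at least one atom within
    `cutoff` angstroms of the metal ion center.

    Assumption: "residue within 3.5 A" is read as "any atom of the
    residue within 3.5 A"; the source does not state the exact rule.
    """
    hits = set()
    for atom in atoms:
        distance = math.dist(metal_xyz, (atom.x, atom.y, atom.z))
        if distance <= cutoff:
            hits.add(atom.residue_id)
    return sorted(hits)


# Toy coordinates invented for illustration; not taken from any PDB entry.
toy_atoms = [
    Atom("HIS94", 2.0, 0.0, 0.0),
    Atom("HIS96", 0.0, 2.1, 0.0),
    Atom("GLU117", 3.0, 3.0, 0.0),   # about 4.24 A away, outside the cutoff
    Atom("HIS119", 0.0, 0.0, -2.2),
]

site = residues_within_cutoff((0.0, 0.0, 0.0), toy_atoms)
print(site)
```

In a real workflow the atoms and the metal position would come from a parsed structure file rather than a hand-written list, and the resulting residue set would then be compared against query structures by the fragment transformation method, which this sketch does not attempt to reproduce.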

  • ABSTRACT: Approximately one third of proteins bind metal ions for stability and/or enzymatic function. However, on a structural level, only a small fraction of binding sites have been resolved. Metal binding site predictions can serve as a first step in putative function assignment for many unannotated proteins. Sequence-based and structure-based methods for metal binding site prediction are reviewed here. The CHED and SeqCHED methods of prediction from apo protein structures and translated gene sequences, respectively, are described in detail, including their web server applications. The relevance of SeqCHED to the analysis of single nucleotide polymorphisms (SNPs) associated with disease-related metal binding sites is illustrated.
    Israel Journal of Chemistry (Online) 04/2013; 53(3‐4). DOI:10.1002/ijch.201200084 · 2.56 Impact Factor
  • ABSTRACT: The survival of Mycobacterium tuberculosis depends on mycolic acids, very long α-alkyl-β-hydroxy fatty acids comprising 60 to 90 carbon atoms. However, despite considerable efforts, little is known about how enzymes involved in mycolic acid biosynthesis recognize and bind their hydrophobic fatty acyl substrates. The condensing enzyme KasA is pivotal for the synthesis of very long (C38-42) fatty acids, the precursors of mycolic acids. To probe the mechanism of substrate and inhibitor recognition by KasA, we determined the structure of this protein in complex with a mycobacterial phospholipid, and with several thiolactomycin derivatives that were designed as substrate analogs. Our structures provide consecutive snapshots along the reaction coordinate for the enzyme-catalyzed reaction, and support an induced-fit mechanism in which a wide cavity is established through the concerted opening of three gatekeeping residues and several α-helices. The stepwise characterization of the binding process provides mechanistic insights into the induced-fit recognition in this system and serves as an excellent foundation for the development of high-affinity KasA inhibitors.
    Journal of Biological Chemistry 10/2013; 288(47). DOI:10.1074/jbc.M113.511436 · 4.60 Impact Factor
  • ABSTRACT: Accurately identifying protein-ligand binding sites or pockets is of significant importance for both protein function analysis and drug design. Although much progress has been made, challenges remain, especially when the 3D structures of target proteins are not available or no homology templates can be found in the library, in which case template-based methods are hard to apply. In this paper, we report a new ligand-specific template-free predictor called TargetS for targeting protein-ligand binding sites from primary sequences. TargetS first predicts the binding residues along the sequence with a ligand-specific strategy and then further identifies the binding sites from the predicted binding residues through a recursive spatial clustering algorithm. Protein evolutionary information, predicted protein secondary structure, and ligand-specific binding propensities of residues are combined to construct discriminative features; an improved AdaBoost classifier ensemble scheme based on random undersampling is proposed to deal with the serious imbalance problem between positive (binding) and negative (nonbinding) samples. Experimental results demonstrate that TargetS achieves high performance and outperforms many existing predictors. TargetS web server and data sets are freely available at: for academic use.
    IEEE/ACM Transactions on Computational Biology and Bioinformatics 07/2013; 10(4):994-1008. DOI:10.1109/TCBB.2013.104 · 1.54 Impact Factor
