Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for fast protein structure database search

Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan.
Genome biology (Impact Factor: 10.81). 02/2007; 8(3):R31. DOI: 10.1186/gb-2007-8-3-r31
Source: PubMed


We present a novel protein structure database search tool, 3D-BLAST, that is useful for analyzing novel structures and can return a ranked list of alignments. This tool has the features of BLAST (for example, robust statistical basis, and effective and reliable search capabilities) and employs a kappa-alpha (κ, α) plot derived structural alphabet and a new substitution matrix. 3D-BLAST searches more than 12,000 protein structures in 1.2 s and yields good results in zones with low sequence similarity.
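The abstract does not define the two angles behind the kappa-alpha plot. As a minimal sketch, assuming the usual DSSP-style definitions (kappa is the bend angle at residue i formed by the Cα atoms of residues i-2, i and i+2; alpha is the dihedral angle of the Cα atoms of residues i-1, i, i+1 and i+2), the pair for each residue could be computed as follows. The function names are hypothetical, and the step that clusters (kappa, alpha) pairs into structural alphabet letters is not shown because the abstract gives no details of it.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def kappa(ca, i):
    """Bend angle in degrees (0..180) from C-alpha atoms i-2, i, i+2 (assumed definition)."""
    u = _sub(ca[i], ca[i - 2])
    v = _sub(ca[i + 2], ca[i])
    c = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def alpha(ca, i):
    """Dihedral in degrees (-180..180) of C-alpha atoms i-1, i, i+1, i+2 (assumed definition)."""
    b1 = _sub(ca[i], ca[i - 1])
    b2 = _sub(ca[i + 1], ca[i])
    b3 = _sub(ca[i + 2], ca[i + 1])
    y = _norm(b2) * _dot(b1, _cross(b2, b3))
    x = _dot(_cross(b1, b2), _cross(b2, b3))
    return math.degrees(math.atan2(y, x))

def kappa_alpha_pairs(ca):
    """Return (kappa, alpha) for every residue that has two neighbours on each side."""
    return [(kappa(ca, i), alpha(ca, i)) for i in range(2, len(ca) - 2)]
```

Here `ca` is a list of (x, y, z) Cα coordinates in chain order. A chain of n residues yields n - 4 angle pairs, since the first two and last two residues lack the neighbours that kappa needs.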



Available from: Jinn-Moon Yang, Jun 02, 2015
    • "In line with this hypothesis, we developed a method that takes advantage of sequence similarity and structural related features [e.g. CS (Chothia and Lesk, 1987; Tramontano et al., 1990) as well as a score reflecting the likelihood of the presence/absence of specific interactions between the H3 residues and the rest of the modeled antibody structure (Tung et al., 2007)]. "
    ABSTRACT: Motivation: Antibodies are able to recognize a wide range of antigens through their complementary determining regions formed by six hypervariable loops. Predicting the 3D structure of these loops is essential for the analysis and reengineering of novel antibodies with enhanced affinity and specificity. The canonical structure model allows high accuracy prediction for five of the loops. The third loop of the heavy chain, H3, is the hardest to predict because of its diversity in structure, length and sequence composition. Results: We describe a method, based on the Random Forest automatic learning technique, to select structural templates for H3 loops among a dataset of candidates. These can be used to predict the structure of the loop with a higher accuracy than that achieved by any of the presently available methods. The method also has the advantage of being extremely fast and returning a reliable estimate of the model quality. Availability and implementation: The source code is freely available at anna.tramontano@uniroma1.it. Supplementary Information: Supplementary data are available at Bioinformatics online.
    Full-text · Article · Jun 2014 · Bioinformatics
    • "To find the candidates of the polypharmacological targets of a query protein, we firstly transformed the 3D protein structure of a query complex (i.e. protein and its binding ligand) into a 1D sequence with 23 states of a structural alphabet by using in-house tool 3D-BLAST (Figure 1A) [13,14]. We then identified the binding segments in the interface of the query complex. "
    ABSTRACT: Background: To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for the complex diseases (e.g. cancers and diabetes). In general, the polypharmacological proteins often share similar local binding environments and motifs. As the exponential growth of the number of protein structures, to find the similar structural binding motifs (pharma-motifs) is an emergency task for drug discovery (e.g. side effects and new uses for old drugs) and protein functions. Results: We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize the binding motifs by searching against protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% polypharmacological targets of each protein-ligand complex are annotated with same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play the key roles for protein functions. The SRPmotif is available at Conclusions: SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservations of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide the clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery.
    Full-text · Article · Dec 2012 · BMC Genomics
    • "Jung and Lee (2000), ProSup (Lackner et al., 2000) and TM-Align (Zhang and Skolnick, 2005) identify short seed fragments to give initial superpositions which are then optimised in similar ways. Yet other approaches define and match higher-order structural alphabets (Yang and Tung, 2006; Tyagi et al., 2007; Lo et al., 2007; Konagurthu et al., 2008; Stivala et al., 2009; Razmara et al., 2012), or fragments which might subsequently be re-assembled (Pandit and Skolnick, 2008; Budowski-Tal et al., 2010). Most algorithms treat proteins as rigid-body objects, but a few can take into account structural flexibility (Ye and Godzik, 2003; Salem et al., 2010), permutations of structural motifs (Szustakowski and Weng, 2000; Ilyin et al., 2004; Chen et al., 2006; Sabarinathan et al., 2010; Sippl and Wiederstein, 2012), and even composite alignments involving multiple chains (Sippl and Wiederstein, 2012). "
    ABSTRACT: Motivation: Aligning and comparing protein structures is important for understanding their evolutionary and functional relationships. With the rapid growth of protein structure databases in recent years, the need to align, superpose and compare protein structures rapidly and accurately has never been greater. Many structural alignment algorithms have been described in the past 20 years. However, achieving an algorithm that is both accurate and fast remains a considerable challenge. Results: We have developed a novel protein structure alignment algorithm called 'Kpax', which exploits the highly predictable covalent geometry of C(α) atoms to define multiple local coordinate frames in which backbone peptide fragments may be oriented and compared using sensitive Gaussian overlap scoring functions. A global alignment and hence a structural superposition may then be found rapidly using dynamic programming with secondary structure-specific gap penalties. When superposing pairs of structures, Kpax tends to give tighter secondary structure overlays than several popular structure alignment algorithms. When searching the CATH database, Kpax is faster and more accurate than the very efficient Yakusa algorithm, and it gives almost the same high level of fold recognition as TM-Align while being more than 100 times faster.
    Preview · Article · Oct 2012 · Bioinformatics