
Alignment of protein sequences by their profiles.

Mission Bay Genentech Hall, University of California, San Francisco, San Francisco, CA 94143, USA.
Protein Science (Impact Factor: 2.86). 05/2004; 13(4):1071-87. DOI: 10.1110/ps.03379804
Source: PubMed

ABSTRACT The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences.
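The accuracy figures quoted above are fractions of correctly aligned residues relative to a structure-based reference alignment. A minimal sketch of one way to compute such a fraction follows. It assumes both alignments are supplied as gapped strings with `-` as the gap character, and it counts a residue pair as correct only when the same two positions share a column in both alignments. The function names are hypothetical and are not part of MODELLER; the paper's exact definition may differ in detail.

```python
def aligned_pairs(seq_a, seq_b, gap="-"):
    """Return the set of (i, j) residue-index pairs that share a column.

    seq_a and seq_b are the two rows of a pairwise alignment (same length);
    i and j are 0-based positions in the ungapped sequences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have equal length")
    pairs = set()
    i = j = 0
    for a, b in zip(seq_a, seq_b):
        if a != gap and b != gap:
            pairs.add((i, j))
        # Advance each residue counter only when that row holds a residue.
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def alignment_accuracy(test, reference, gap="-"):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference[0], reference[1], gap)
    if not ref_pairs:
        return 0.0
    test_pairs = aligned_pairs(test[0], test[1], gap)
    return len(ref_pairs & test_pairs) / len(ref_pairs)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the benchmark set.
    reference = ("ACDEFG", "AC-EFG")
    test = ("ACDEFG", "ACE-FG")
    print(alignment_accuracy(test, reference))
```

In the toy example the reference aligns five residue pairs and the test alignment reproduces four of them, so the reported fraction is 0.8.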

  • ABSTRACT: Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
    PLoS ONE 11/2014; 9(11):e113811. · 3.53 Impact Factor
  • ABSTRACT: Publisher’s description: Volume one of this two-volume sequence focuses on the basic characterization of known protein structures as well as structure prediction from protein sequence information. The 11 chapters provide an overview of the field, covering key topics in modeling, force fields, classification, computational methods, and structure prediction. Each chapter is a self-contained review designed to cover (1) definition of the problem and an historical perspective, (2) mathematical or computational formulation of the problem, (3) computational methods and algorithms, (4) performance results, (5) existing software packages, and (6) strengths, pitfalls, challenges, and future research directions. Table of contents: Modeling protein structures. Empirical force fields. Knowledge-based energy functions for computational studies of proteins.
  • ABSTRACT: A long-standing problem in structural bioinformatics is to determine the three-dimensional (3-D) structure of a protein when only a sequence of amino acid residues is given. Many computational methodologies and algorithms have been proposed as a solution to the 3-D Protein Structure Prediction (3-D-PSP) problem. These methods can be divided into four main classes: (a) first principle methods without database information; (b) first principle methods with database information; (c) fold recognition and threading methods; and (d) comparative modeling methods and sequence alignment strategies. Deterministic computational techniques, optimization techniques, data mining and machine learning approaches are typically used in the construction of computational solutions for the PSP problem. Our main goal with this work is to review the methods and computational strategies that are currently used in 3-D protein prediction. Copyright © 2014 Elsevier Ltd. All rights reserved.
    Computational Biology and Chemistry 10/2014; · 1.60 Impact Factor
