A Graph-theory Algorithm for Rapid Protein Side-chain Prediction

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA.
Protein Science (Impact Factor: 2.85). 10/2003; 12(9):2001-14. DOI: 10.1110/ps.03154503
Source: PubMed


Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in side-chain prediction. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total χ1 and χ1+2 dihedral angle accuracies are 82.6% and 73.7%, respectively, using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
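The graph decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not the SCWRL implementation: the function names, the toy residue labels, and the `pair_energy` layout (a matrix of rotamer-pair energies per residue pair) are assumptions made for illustration, and the per-component energy minimization is omitted. The sketch builds the interaction graph, splits it into connected subgraphs, and then finds biconnected components and articulation points with the standard Hopcroft-Tarjan depth-first search.

```python
def interaction_graph(residues, pair_energy):
    """Connect two residues if any of their rotamer pairs has nonzero energy."""
    adj = {r: set() for r in residues}
    for (a, b), matrix in pair_energy.items():
        if any(e != 0.0 for row in matrix for e in row):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def connected_components(adj):
    """Return the vertex sets of the connected subgraphs."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v] - seen:
                seen.add(w)
                stack.append(w)
        comps.append(comp)
    return comps


def biconnected_components(adj):
    """Hopcroft-Tarjan DFS. Returns (vertex sets, articulation points)."""
    disc, low = {}, {}
    edge_stack, comps, cut = [], [], set()

    def dfs(v, parent):
        disc[v] = low[v] = len(disc)
        children = 0
        for w in adj[v]:
            if w not in disc:                      # tree edge
                children += 1
                edge_stack.append((v, w))
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= disc[v]:              # v separates w's subtree
                    if parent is not None or children > 1:
                        cut.add(v)
                    comp = set()
                    while True:                    # pop one whole component
                        edge = edge_stack.pop()
                        comp.update(edge)
                        if edge == (v, w):
                            break
                    comps.append(comp)
            elif w != parent and disc[w] < disc[v]:  # back edge
                edge_stack.append((v, w))
                low[v] = min(low[v], disc[w])

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return comps, cut


# Toy data (hypothetical): two triangles sharing residue C, one separate
# pair, one isolated residue, and one pair whose energies are all zero.
residues = ["A", "B", "C", "D", "E", "F", "G", "H"]
clash = [[0.0, 1.5], [0.3, 0.0]]
pair_energy = {
    ("A", "B"): clash, ("B", "C"): clash, ("A", "C"): clash,
    ("C", "D"): clash, ("D", "E"): clash, ("C", "E"): clash,
    ("F", "G"): clash,
    ("E", "F"): [[0.0, 0.0], [0.0, 0.0]],   # no interaction, so no edge
}

adj = interaction_graph(residues, pair_energy)
components = connected_components(adj)
bicomps, cut = biconnected_components(adj)
```

In this toy graph residue C is the only articulation point, so the two triangles could be minimized separately for each rotamer of C and the results combined, which is the idea the abstract describes for reducing the combinatorial search.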



Available from: Roland Dunbrack, Oct 04, 2015
    • "The wild-type and the quadruple mutant F68A/F142A/F143A/F179A of HRPC were simulated. The structure of the quadruple mutant was modeled in the SCWRL 3.0 program based on the combination of rotamer libraries (Canutescu et al., 2003). The CHARMM force field was applied to the molecules, and the CDOCKER module was used for molecular docking simulation."
    ABSTRACT: Peroxidases have great potential as industrial biocatalysts. In particular, the oxidative polymerization of phenolic compounds catalyzed by peroxidases has been extensively examined because of the advantage of this method over other conventional chemical methods. However, the industrial application of peroxidases is often limited because of their rapid inactivation by phenoxyl radicals during oxidative polymerization. In this work, we report a novel protein engineering approach to improve the radical stability of horseradish peroxidase isozyme C (HRPC). Phenylalanine residues that are vulnerable to modification by the phenoxyl radicals were identified using mass spectrometry analysis. UV-Vis and CD spectra showed that radical coupling did not change the secondary structure or the active site of HRPC. Four phenylalanine (Phe) residues (F68, F142, F143, and F179) were each mutated to alanine residues to generate single mutants to examine the role of these sites in radical coupling. Despite marginal improvement of radical stability, each single mutant still exhibited rapid radical inactivation. To further reduce inactivation by radical coupling, the four substitution mutations were combined in F68A/F142A/F143A/F179A. This mutant demonstrated dramatic enhancement of radical stability by retaining 41% of its initial activity compared to the wild-type, which was completely inactivated. Structure and sequence alignment revealed that radical-vulnerable Phe residues of HRPC are conserved in homologous peroxidases, which showed the same rapid inactivation tendency as HRPC. Based on our site-directed mutagenesis and biochemical characterization, we have shown that engineering radical-vulnerable residues to eliminate multiple radical coupling can be a good strategy to improve the stability of peroxidases against radical attack. Biotechnol. Bioeng. © 2014 Wiley Periodicals, Inc.
    Biotechnology and Bioengineering 04/2015; 112(4). DOI:10.1002/bit.25483 · 4.13 Impact Factor
    • "The target-template alignment of sequences is a crucial step in the modeling procedure and we manually inspected the alignment to ensure that there are no gaps in the transmembrane helical segments and the highly conserved residues in specific transmembrane segments are aligned in the same column. Among the 10 models generated for each sequence, the model with the optimal objective function was selected for further side-chain refinement using SCWRL3 package [46]. Only side-chains of those residues which are not conserved across the three template and the target sequences were modeled."
    ABSTRACT: The superfamily of major intrinsic proteins (MIPs) includes aquaporin (AQP) and aquaglyceroporin (AQGP) and it is involved in the transport of water and neutral solutes across the membrane. Diverse MIP sequences adopt a unique hour-glass fold with six transmembrane helices (TM1 to TM6) and two half-helices (LB and LE). Loop E contains one of the two conserved NPA motifs and contributes two residues to the aromatic/arginine selectivity filter. Function and regulation of majority of MIP channels are not yet characterized. We have analyzed the loop E region of 1468 MIP sequences and their structural models from six different organism groups. They can be phylogenetically clustered into AQGPs, AQPs, plant MIPs and other MIPs. The LE half-helix in all AQGPs contains an intra-helical salt-bridge and helix-breaking residues Gly/Pro within the same helical turn. All non-AQGPs lack this salt-bridge but have the helix destabilizing Gly and/or Pro in the same positions. However, the segment connecting LE half-helix and TM6 is longer by 10-15 residues in AQGPs compared to all non-AQGPs. We speculate that this longer loop in AQGPs and the LE half-helix of non-AQGPs will be relatively more flexible and this could be functionally important. Molecular dynamics simulations on glycerol-specific GlpF, water-transporting AQP1, its mutant and a fungal AQP channel confirm these predictions. Thus two distinct regions of loop E, one in AQGPs and the other in non-AQGPs, seem to be capable of modulating the transport. These regions can also act in conjunction with other extracellular residues/segments to regulate MIP channel transport. Copyright © 2015. Published by Elsevier B.V.
    Biochimica et Biophysica Acta 03/2015; 1848(6). DOI:10.1016/j.bbamem.2015.03.013 · 4.66 Impact Factor
    • "For scTIM, we retrieved the 3D structure from the RCSB Protein Data Bank [23] [PDB:2YPI] and downloaded the precomputed RIN from the RINdata web service [14]. Since there is no experimentally resolved protein structure of dTIM, we used the SCWRL Server [24] at BIC-JCSG with default settings and the parent structure as template to generate a three-dimensional model. A RIN for the defective mutant was created from the modeled structure by our RINerator package. "
    ABSTRACT: Background: An important aspect of studying the relationship between protein sequence, structure and function is the molecular characterization of the effect of protein mutations. To understand the functional impact of amino acid changes, the multiple biological properties of protein residues have to be considered together. Results: Here, we present a novel visual approach for analyzing residue mutations. It combines different biological visualizations and integrates them with molecular data derived from external resources. To show various aspects of the biological information on different scales, our approach includes one-dimensional sequence views, three-dimensional protein structure views and two-dimensional views of residue interaction networks as well as aggregated views. The views are linked tightly and synchronized to reduce the cognitive load of the user when switching between them. In particular, the protein mutations are mapped onto the views together with further functional and structural information. We also assess the impact of individual amino acid changes by the detailed analysis and visualization of the involved residue interactions. We demonstrate the effectiveness of our approach and the developed software on the data provided for the BioVis 2013 data contest. Conclusions: Our visual approach and software greatly facilitate the integrative and interactive analysis of protein mutations based on complementary visualizations. The different data views offered to the user are enriched with information about molecular properties of amino acid residues and further biological knowledge.
    BMC proceedings 08/2014; 8(Suppl 2 Proceedings of the 3rd Annual Symposium on Biologica):S2. DOI:10.1186/1753-6561-8-S2-S2