A Graph-theory Algorithm for Rapid Protein Side-chain Prediction

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA.
Protein Science (Impact Factor: 2.85). 10/2003; 12(9):2001-14. DOI: 10.1110/ps.03154503
Source: PubMed


Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in side-chain prediction. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34,342 side chains in <7 min of computer time. The total chi(1) and chi(1 + 2) dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
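
The graph decomposition described in the abstract can be illustrated with a minimal Python sketch. This is not SCWRL's code: the function names, the `pair_energy` data layout, and the toy energies are all assumptions made for illustration. It builds the residue interaction graph, splits it into connected subgraphs, and then finds biconnected components with a standard depth-first search that tracks low-link values.

```python
from collections import defaultdict

def build_interaction_graph(n_residues, pair_energy):
    """Residues are vertices; an edge joins two residues if any pair of
    their rotamers has a nonzero interaction energy.
    pair_energy (hypothetical layout): {(i, j): table of rotamer energies}."""
    adj = defaultdict(set)
    for (i, j), table in pair_energy.items():
        if any(e != 0.0 for row in table for e in row):
            adj[i].add(j)
            adj[j].add(i)
    return {v: set(adj[v]) for v in range(n_residues)}

def connected_components(adj):
    """Partition the graph into subgraphs with no edges between them."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(sorted(comp))
    return comps

def biconnected_components(adj):
    """Vertex sets that cannot be disconnected by removing one vertex.
    Recursive DFS with an edge stack; fine for small graphs, but deep
    graphs would need an iterative version."""
    disc, low = {}, {}
    edge_stack, comps = [], []
    counter = [0]

    def dfs(v, parent):
        disc[v] = low[v] = counter[0]
        counter[0] += 1
        for w in adj[v]:
            if w == parent:
                continue
            if w not in disc:
                edge_stack.append((v, w))
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= disc[v]:  # v separates w's subtree from the rest
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.update((a, b))
                        if (a, b) == (v, w):
                            break
                    comps.append(sorted(comp))
            elif disc[w] < disc[v]:  # back edge to an ancestor
                edge_stack.append((v, w))
                low[v] = min(low[v], disc[w])

    for v in adj:
        if v not in disc:
            dfs(v, None)
    return comps

# Toy example: 6 residues, made-up energies (not real data).
pair_energy = {
    (0, 1): [[0.0, 1.2], [0.0, 0.0]],
    (1, 2): [[0.5]],
    (0, 2): [[0.0, 0.3]],
    (2, 3): [[2.0]],
    (4, 5): [[0.0, 0.0]],  # all zero, so no edge is created
}
graph = build_interaction_graph(6, pair_energy)
subgraphs = connected_components(graph)
blocks = sorted(biconnected_components(graph))
```

In this toy case residues 0, 1, and 2 form a cycle and residue 3 hangs off residue 2, so residue 2 is the single vertex whose removal disconnects the subgraph. A minimum-energy search could then treat the two blocks separately and combine the results through the shared residue.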



Available from: Roland Dunbrack
  • Source
    • "Graphs are a powerful and natural way to represent complex data with integrated structure. Graphs have been used in numerous applications in a number of different fields, such as (i) computer vision [1] [2] and biomedical imaging analysis [3] [4] [5], (ii) bioinformatics [6] [7], (iii) social networks analysis [8] and (iv) chemoinformatics [9]. In many applications, the exploration of the data requires the ability to efficiently compare graphs and to provide a similarity measurement, a problem known as graph comparison."
    ABSTRACT: Graphs are flexible and powerful representations for non-vectorial structured data. Graph kernels have been shown to enable efficient and accurate statistical learning on this important domain, but many graph kernel algorithms have high-order polynomial time complexity. Efficient graph kernels rely on a discrete node labeling as a central assumption. However, many real-world domains are naturally described by continuous or vector-valued node labels. In this paper, we propose an efficient graph representation and comparison scheme for large graphs with continuous vector labels, the pyramid quantized Weisfeiler-Lehman graph representation. Our algorithm considers statistics of subtree patterns with discrete labels based on the Weisfeiler-Lehman algorithm and uses a pyramid quantization strategy to determine a logarithmic number of discrete labelings that results in a representation that guarantees a multiplicative error bound on an approximation to the optimal partial matching. As a result, we approximate a graph representation with continuous vector labels as a sequence of graphs with increasingly granular discrete labels. We evaluate our proposed algorithm on two different tasks with real datasets, on an fMRI analysis task and on the generic problem of 3D shape classification. Source code of the implementation can be downloaded from
    Preview · Article · Sep 2015 · Neurocomputing
  • Source
    • "Simulating the unbound structure from the bound one provides such an opportunity. Our earlier set of simulated unbound structures [13], based on an older version of DOCKGROUND, was generated by changing the side chain conformations according to the rotamer library [16]. In the current paper we describe a much larger set obtained by Langevin dynamics simulation and based on a systematic analysis of the experimentally determined bound/unbound structural differences."

    Preview · Article · Jul 2015 · BMC Bioinformatics
  • Source
    • "The wild-type and the quadruple mutant F68A/F142A/F143A/F179A of HRPC were simulated. The structure of the quadruple mutant was modeled in a SCWRL 3.0 program based on the combination of rotamer libraries (Canutescu et al., 2003). The CHARMM force field was applied to the molecules, and the CDOCKER module was used for molecular docking simulation."
    ABSTRACT: Peroxidases have great potential as industrial biocatalysts. In particular, the oxidative polymerization of phenolic compounds catalyzed by peroxidases has been extensively examined because of the advantage of this method over other conventional chemical methods. However, the industrial application of peroxidases is often limited because of their rapid inactivation by phenoxyl radicals during oxidative polymerization. In this work, we report a novel protein engineering approach to improve the radical stability of horseradish peroxidase isozyme C (HRPC). Phenylalanine residues that are vulnerable to modification by the phenoxyl radicals were identified using mass spectrometry analysis. UV-Vis and CD spectra showed that radical coupling did not change the secondary structure or the active site of HRPC. Four phenylalanine (Phe) residues (F68, F142, F143, and F179) were each mutated to alanine residues to generate single mutants to examine the role of these sites in radical coupling. Despite marginal improvement of radical stability, each single mutant still exhibited rapid radical inactivation. To further reduce inactivation by radical coupling, the four substitution mutations were combined in F68A/F142A/F143A/F179A. This mutant demonstrated dramatic enhancement of radical stability by retaining 41% of its initial activity compared to the wild-type, which was completely inactivated. Structure and sequence alignment revealed that radical-vulnerable Phe residues of HRPC are conserved in homologous peroxidases, which showed the same rapid inactivation tendency as HRPC. Based on our site-directed mutagenesis and biochemical characterization, we have shown that engineering radical-vulnerable residues to eliminate multiple radical coupling can be a good strategy to improve the stability of peroxidases against radical attack. Biotechnol. Bioeng. © 2014 Wiley Periodicals, Inc.
    Full-text · Article · Apr 2015 · Biotechnology and Bioengineering