SCWRL and MolIDE: Computer programs for side-chain conformation prediction and homology modeling

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, PA 19111, USA.
Nature Protocols (Impact Factor: 8.36). 02/2008; 3(12):1832-47. DOI: 10.1038/nprot.2008.184
Source: PubMed

ABSTRACT SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at (
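Because the abstract describes SCWRL as a command-line program that takes a fixed backbone and returns predicted side chains, a call from a script can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' documented interface: the executable name `scwrl`, the helper `build_scwrl_command`, and the file names are hypothetical, and the `-i` (input PDB), `-o` (output PDB) and `-s` (optional sequence file) flags are assumed from commonly distributed SCWRL releases and should be checked against the installed version.

```python
from typing import List, Optional


def build_scwrl_command(
    input_pdb: str,
    output_pdb: str,
    sequence_file: Optional[str] = None,
    executable: str = "scwrl",  # hypothetical name; depends on the installed release
) -> List[str]:
    """Assemble an argument list for a SCWRL-style side-chain prediction run.

    The flags used here (-i, -o, -s) are assumptions, not confirmed by the abstract.
    """
    cmd = [executable, "-i", input_pdb, "-o", output_pdb]
    if sequence_file is not None:
        # Optional target sequence to thread onto the fixed backbone (assumed flag).
        cmd += ["-s", sequence_file]
    return cmd


if __name__ == "__main__":
    # Only print the command; nothing is executed here.
    command = build_scwrl_command("template_backbone.pdb", "model.pdb")
    print(" ".join(command))
```

Returning the command as a list rather than a single string means it can be passed to `subprocess.run(command, check=True)` without shell quoting issues once the executable name and flags have been confirmed.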

Available from: Roland Dunbrack, Sep 01, 2015
  • Source
    • "The most reliable models were evaluated on the basis of root mean square deviation (RMSD), TM score and DOPE Profile. The selected models were further refined using SCWRL 4.0 (Wang et al., 2008) and CHARMm (Vanommeslaeghe et al., 2010) energy minimization using ChiRotor algorithm of DS. The GROMOS (van Gunsteren et al., 1996) algorithm implemented in DeepView (Kaplan and Littlejohn, 2001) was used for energy minimization of the predicted chitinase II structure. "
    ABSTRACT: Thermomyces lanuginosus is a thermophilic fungus that produces a large number of industrially significant enzymes owing to their inherent stability at high temperatures and wide range of pH optima, including thermostable chitinases that have not been fully characterized. Here, we report cloning, characterization and structure prediction of a gene encoding thermostable chitinase II. Sequence analysis revealed that chitinase II gene encodes a 343 amino acid protein of molecular weight 36.65 kDa. Our study reports that chitinase II exhibits a well-defined TIM-barrel topology with an eight-stranded α/β domain. Structural analysis and molecular docking studies suggested that Glu176 is essential for enzyme activity. Folding studies of chitinase II using molecular dynamics simulations clearly demonstrated that the stability of the protein was evenly distributed at 350 K.
    Journal of Theoretical Biology 04/2015; 374. DOI:10.1016/j.jtbi.2015.03.035 · 2.30 Impact Factor
  • Source
    • "The XML files from the SIFTS database (Velankar et al., 2005) were used to find the residue correspondence between the UniProt and PDB sequences. For each unique PDB sequence, we used one iteration of our modified PSI-BLAST (Altschul et al., 1997) from MolIDE (Wang et al., 2008) to generate a profile from sequences in the UniRef90 database (Li et al., 2000). The parameters for PSI-BLAST were '-e 10 -h 0.0001 -v 5000 -b 5000 -N 25 -f 16'. "
    ABSTRACT: Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM-HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. The Pfam assignment data in PDBfam are available at, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly.
    Bioinformatics 08/2012; 28(21):2763-72. DOI:10.1093/bioinformatics/bts533 · 4.62 Impact Factor
  • Source
    • "In the past three decades there have been lots of studies in each direction. Different kinds of energy functions have been tried and developed [4-10]. In the domain of search strategy, a broad range of combinatorial search algorithms, both exact [11-15] and approximate [16-20] ones, have been applied. "
    ABSTRACT: The protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries summarize the existing knowledge of the experimentally determined structures quantitatively. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since in the side-chain prediction problem, the backbone structure is given, spatially local information should ideally be encoded into the rotamer libraries. In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by the inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of the side-chain prediction accuracy and the rotamer ranking ability. Furthermore, without global optimization/search, the side-chain prediction power of the protein-dependent library is still comparable to the global-search-based side-chain prediction methods.
    BMC Bioinformatics 12/2011; 12(Suppl 14):S10. DOI:10.1186/1471-2105-12-S14-S10 · 2.67 Impact Factor