Article

PISCES: recent improvements to a PDB sequence culling server

Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA.
Nucleic Acids Research (Impact Factor: 8.81). 08/2005; 33(Web Server issue):W94-8. DOI: 10.1093/nar/gki402
Source: PubMed

ABSTRACT PISCES is a database server for producing lists of sequences from the Protein Data Bank (PDB) using a number of entry- and chain-specific criteria and mutual sequence identity. Our goal in culling the PDB is to provide the longest list possible of the highest resolution structures that fulfill the sequence identity and structural quality cut-offs. The new PISCES server uses a combination of PSI-BLAST and structure-based alignments to determine sequence identities. Structure alignment produces more complete alignments and therefore more accurate sequence identities than PSI-BLAST. PISCES now allows a user to cull the PDB by-entry in addition to the standard culling by individual chains. In this scenario, a list will contain only entries that do not have a chain that has a sequence identity to any chain in any other entry in the list over the sequence identity cut-off. PISCES also provides fully annotated sequences including gene name and species. The server allows a user to cull an input list of entries or chains, so that other criteria, such as function, can be used. Results from a search on the re-engineered RCSB's site for the PDB can be entered into the PISCES server by a single click, combining the powerful searching abilities of the PDB with PISCES's utilities for sequence culling. The server's data are updated weekly. The server is available at http://dunbrack.fccc.edu/pisces.

Full-text

Available from: Roland Dunbrack, Jun 06, 2015
0 Followers
 · 
102 Views
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Tuberculosis kills more than a million people annually and presents increasingly high levels of resistance against current first line drugs. Structural information about Mycobacterium tuberculosis (Mtb) proteins is a valuable asset for the development of novel drugs and for understanding the biology of the bacterium; however, only about 10% of the ∼4000 proteins have had their structures determined experimentally. The CHOPIN database assigns structural domains and generates homology models for 2911 sequences, corresponding to ∼73% of the proteome. A sophisticated pipeline allows multiple models to be created using conformational states characteristic of different oligomeric states and ligand binding, such that the models reflect various functional states of the proteins. Additionally, CHOPIN includes structural analyses of mutations potentially associated with drug resistance. Results are made available at the web interface, which also serves as an automatically updated repository of all published Mtb experimental structures. Its RESTful interface allows direct and flexible access to structures and metadata via intuitive URLs, enabling easy programmatic use of the models. Database URL: http://structure.bioc.cam.ac.uk/chopin. © The Author(s) 2015. Published by Oxford University Press.
    Database The Journal of Biological Databases and Curation 01/2015; 2015. DOI:10.1093/database/bav026 · 4.46 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Interactions between proteins and DNA play an important role in many essential biological processes such as DNA replication, transcription, splicing, and repair. The identification of amino acid residues involved in DNA-binding sites is critical for understanding the mechanism of these biological activities. In the last decade, numerous computational approaches have been developed to predict protein DNA-binding sites based on protein sequence and/or structural information, which play an important role in complementing experimental strategies. At this time, approaches can be divided into three categories: sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modeling and threading. In this article, we review existing research on computational methods to predict protein DNA-binding sites, which includes data sets, various residue sequence/structural features, machine learning methods for comparison and selection, evaluation methods, performance comparison of different tools, and future directions in protein DNA-binding site prediction. In particular, we detail the meta-analysis of protein DNA-binding sites. We also propose specific implications that are likely to result in novel prediction methods, increased performance, or practical applications.
    International Journal of Molecular Sciences 03/2015; 16(3):5194-5215. DOI:10.3390/ijms16035194 · 2.46 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Interaction sites on protein surfaces mediate virtually all biological activities, and their identification holds promise for disease treatment and drug design. Novel algorithmic approaches for the prediction of these sites have been produced at a rapid rate, and the field has seen significant advancement over the past decade. However, the most current methods have not yet been reviewed in a systematic and comprehensive fashion. Herein, we describe the intricacies of the biological theory, datasets, and features required for modern protein-protein interaction site (PPIS) prediction, and present an integrative analysis of the state-of-the-art algorithms and their performance. First, the major sources of data used by predictors are reviewed, including training sets, evaluation sets, and methods for their procurement. Then, the features employed and their importance in the biological characterization of PPISs are explored. This is followed by a discussion of the methodologies adopted in contemporary prediction programs, as well as their relative performance on the datasets most recently used for evaluation. In addition, the potential utility that PPIS identification holds for rational drug design, hotspot prediction, and computational molecular docking is described. Finally, an analysis of the most promising areas for future development of the field is presented.
    Algorithms for Molecular Biology 12/2015; 10(1):7. DOI:10.1186/s13015-015-0033-9 · 1.86 Impact Factor