Protein Structure Similarity Clustering: Dynamic Treatment of PDB Structures Facilitates Clustering

Department of Chemistry, University of Nebraska at Lincoln, Lincoln, Nebraska, United States
Angewandte Chemie International Edition (Impact Factor: 11.34). 11/2006; 45(46):7766-70. DOI: 10.1002/anie.200602125
Source: PubMed

ABSTRACT: In the family: The introduction of ligand docking, molecular dynamics, and the VAST algorithm into protein structure similarity clustering greatly streamlines the process and opens up otherwise unseen connections to new protein-cluster partners. VAST = vector alignment search tool.


Available from: David B Berkowitz, May 07, 2015
  • ABSTRACT: Motivation: Protein sequence data is growing at an exponential rate. However, a considerable portion of this data is redundant, with many new sequences being very similar to others in the databases. While clustering has been used to reduce this redundancy, the influence of sequence similarity on the functional quality of the clusters is still unclear. Results: In this work, we introduce a greedy graph-based clustering algorithm, which is tested using the Swiss-Prot database. We study the topology of the protein space as a function of the BLAST e-value threshold, and the functional characterization of the clusters using the Gene Ontology. Initial results suggest that the cluster centers alone can capture a large portion of the information content of the database, thereby largely reducing its redundancy. As expected, cluster functional coherence and characterization increased with the stringency of the threshold, as did the amount of information captured by the cluster centers.
  • ABSTRACT: In silico searches for new drug candidates have been considered a cost-effective alternative to experimental drug screening. The efficiency of in silico approaches relies on important assumptions regarding the target selection, the methods employed, the quality of the drug and target molecule structures, and the computing environment (CPU-limited workstations, grids). The use of in silico methods is sometimes the only possible strategy when the infectious agent cannot be propagated safely or with sufficient reproducibility in a laboratory environment, when the target proteins cannot be heterologously expressed for structural and activity analyses, or when the genetic variability requires the specific definition of invariable protein domains as potential targets. When the structure of the target is not available, in silico ligand-based drug design may be the only possible alternative strategy to find novel bioactive molecules. The strengths and limitations of in silico strategies therefore rely on the accuracy of the prior knowledge and assumptions. Here, we focus on Plasmodial proteins as a case study, since these proteins often cannot be expressed in recombinant systems and are therefore difficult to characterize structurally using traditional physical approaches. This chapter will (1) briefly address the question of target selection in the context of a parasitic infection such as malaria, (2) introduce the peculiarities of malaria proteins and detail some in silico approaches to describe molecular structures of malaria drug targets, and review subsequent rational approaches for (3) receptor-based and (4) ligand-based drug design. This review also presents the evolution of the computing infrastructures that make in silico experiments possible and particularly discusses the WISDOM initiative for grid-enabled drug discovery against neglected and emergent diseases.
    08/2009: pages 279-304
  • ABSTRACT: Bioinformatics relies heavily on web resources for information gathering. Ontologies are being developed to fill the background knowledge needed to drive Semantic Web applications. This paper discusses how ontologies are not always suited for document navigation on the web. Converting ontologies into a model with looser semantics allows cheap and rapid generation of useful knowledge systems. The message is that ontologies are not the only knowledge artefact needed; vocabularies and other classification schemes with weaker semantics have their role and are the best solution in certain circumstances.
    The 10th Annual Bio-Ontologies Meeting 2007, Co-located with ISMB/ECCB 2007, Vienna, Austria; 07/2007