Leveraging enzyme structure-function relationships for functional inference and experimental design: the structure-function linkage database.

Department of Biopharmaceutical Sciences, University of California, San Francisco, 1700 Fourth Street, San Francisco, California 94143-2250, USA.
Biochemistry (Impact Factor: 3.38). 03/2006; 45(8):2545-55. DOI: 10.1021/bi052101l
Source: PubMed

ABSTRACT The study of mechanistically diverse enzyme superfamilies (collections of enzymes that perform different overall reactions but share both a common fold and a distinct mechanistic step performed by key conserved residues) helps elucidate the structure-function relationships of enzymes. We have developed a resource, the structure-function linkage database (SFLD), to analyze these structure-function relationships. Unique to the SFLD is its hierarchical classification scheme based on linking the specific partial reactions (or other chemical capabilities) that are conserved at the superfamily, subgroup, and family levels with the conserved structural elements that mediate them. We present the results of analyses using the SFLD in correcting misannotations, guiding protein engineering experiments, and elucidating the function of recently solved enzyme structures from the structural genomics initiative. The SFLD is freely accessible at

  • Source
    ABSTRACT: The cytosolic glutathione transferase (cytGST) superfamily comprises more than 13,000 nonredundant sequences found throughout the biosphere. Their key roles in metabolism and defense against oxidative damage have led to thousands of studies over several decades. Despite this attention, little is known about the physiological reactions they catalyze, and most of the substrates used to assay cytGSTs are synthetic compounds. A deeper understanding of relationships across the superfamily could provide new clues about their functions. To establish a foundation for expanded classification of cytGSTs, we generated similarity-based subgroupings for the entire superfamily. Using the resulting sequence similarity networks, we chose targets that broadly covered unknown functions and report here experimental results confirming GST-like activity for 82 of them, along with 37 new 3D structures determined for 27 targets. These new data, along with experimentally known GST reactions and structures reported in the literature, were painted onto the networks to generate a global view of their sequence-structure-function relationships. The results show how proteins of both known and unknown function relate to each other across the entire superfamily and reveal that the great majority of cytGSTs have not been experimentally characterized or annotated by canonical class. A mapping of taxonomic classes across the superfamily indicates that many taxa are represented in each subgroup and highlights challenges for classification of superfamily sequences into functionally relevant classes. Experimental determination of disulfide bond reductase activity in many diverse subgroups illustrates a theme common to many reaction types. Finally, sequence comparison between an enzyme that catalyzes a reductive dechlorination reaction relevant to bioremediation efforts and some of its closest homologs reveals differences among them likely to be associated with evolution of this unusual reaction. Interactive versions of the networks, associated with functional and other types of information, can be downloaded from the Structure-Function Linkage Database (SFLD;
    PLoS Biology 04/2014; 12(4):e1001843. · 12.69 Impact Factor
  • Source
    ABSTRACT: The recent explosion of biological data poses a great challenge for traditional clustering algorithms. As data sets grow, cluster identification requires much more memory and longer runtimes. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a serious bottleneck when handling large-scale data sets. Moreover, because the algorithm clusters data sets based on the similarities between data pairs, it requires a similarity matrix to be built before it runs, and constructing that matrix itself takes a long time. Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The shared-memory architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm because of its large memory size and great computing capacity. An appropriate scheme for data partitioning and reduction is designed in our method to minimize the global communication cost among processes. A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also performs well when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies.
    PLoS ONE 01/2014; 9(4):e91315. · 3.73 Impact Factor
  • Source
    ABSTRACT: Visceral Leishmaniasis (VL) is the most lethal form of Leishmaniasis caused by Leishmania donovani. This disease is the second largest parasitic killer in the world. Trypanothione reductase (TR) is an enzyme commonly present in all members of Trypanosomatidae, including Leishmania. This enzyme, analogous to Glutathione Reductase (GR) of mammals, is crucial for managing the parasite's oxidative stress because it recycles Trypanothione. The three-dimensional structure of TR from L. donovani (LdTR) has not been determined to date. In this study, the three-dimensional structure of LdTR was built by homology modelling and refined using a molecular dynamics program. Various properties of the structural hierarchy of LdTR were also studied, and the catalytic domains present in the enzyme were identified and characterized. The main tools and servers used for the research included MODELLER 9.11, VEGA ZZ 3.01, and MESSA. The results include an energy-minimized, refined comparative 3-D model of LdTR. The study also predicted LdTR to be a mitochondrial protein that uses FAD as an electron donor and is directly involved in homo-dimerization as well as indirectly involved in reduction of reactive oxygen species. The structure was submitted to PMDB (Protein Model Database).
    International Journal of Scientific and Engineering Research 01/2014; 5(1):490-495. · 1.40 Impact Factor
