Integrating Diverse Data for Structure Determination of Macromolecular Assemblies

Department of Biopharmaceutical Sciences, and California Institute for Quantitative Biosciences, University of California at San Francisco, CA 94158-2330, USA.
Annual Review of Biochemistry (Impact Factor: 30.28). 08/2008; 77(1):443-77. DOI: 10.1146/annurev.biochem.77.060407.135530
Source: PubMed


To understand the cell, we need to determine the macromolecular assembly structures, which may consist of tens to hundreds of components. First, we review the varied experimental data that characterize the assemblies at several levels of resolution. We then describe computational methods for generating the structures using these data. To maximize completeness, resolution, accuracy, precision, and efficiency of the structure determination, a computational approach is required that uses spatial information from a variety of experimental methods. We propose such an approach, defined by its three main components: a hierarchical representation of the assembly, a scoring function consisting of spatial restraints derived from experimental data, and an optimization method that generates structures consistent with the data. This approach is illustrated by determining the configuration of the 456 proteins in the nuclear pore complex (NPC) from baker's yeast. With these tools, we are poised to integrate structural information gathered at multiple levels of the biological hierarchy--from atoms to cells--into a common framework.
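
The three components named in the abstract (a representation, a scoring function built from spatial restraints, and an optimization method) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `score` and `optimize` are hypothetical, each component is reduced to a single bead (a flat rather than hierarchical representation), the restraints are toy upper-bound distances, and a simple Metropolis Monte Carlo search stands in for whatever optimizer the authors actually used. It is not the API of any modeling package.

```python
import math
import random


def score(coords, restraints):
    """Sum of squared violations of upper-bound distance restraints.

    Each restraint is a tuple (i, j, d_max): beads i and j should lie
    within d_max of each other. A satisfied restraint contributes 0.
    """
    total = 0.0
    for i, j, d_max in restraints:
        d = math.dist(coords[i], coords[j])
        if d > d_max:
            total += (d - d_max) ** 2
    return total


def optimize(n_beads, restraints, steps=20000, step_size=1.0,
             temperature=1.0, seed=0):
    """Metropolis Monte Carlo search for bead positions with a low score."""
    rng = random.Random(seed)
    # Random starting configuration: one 3D point per component (bead).
    coords = [[rng.uniform(-50, 50) for _ in range(3)] for _ in range(n_beads)]
    current = score(coords, restraints)
    best, best_coords = current, [c[:] for c in coords]
    for _ in range(steps):
        i = rng.randrange(n_beads)
        old = coords[i][:]
        # Propose a small random displacement of one bead.
        coords[i] = [x + rng.gauss(0, step_size) for x in old]
        new = score(coords, restraints)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if new <= current or rng.random() < math.exp((current - new) / temperature):
            current = new
            if new < best:
                best, best_coords = new, [c[:] for c in coords]
        else:
            coords[i] = old  # reject the move
    return best_coords, best


if __name__ == "__main__":
    # Hypothetical data: beads 0-1 and 1-2 must each lie within 10 units.
    toy_restraints = [(0, 1, 10.0), (1, 2, 10.0)]
    model, final_score = optimize(3, toy_restraints)
    print(final_score)
```

In practice one would run the search from many random starting points and examine the ensemble of low-scoring configurations, since sparse restraints rarely single out one structure.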

    • "The molecular architecture of the large subunit of the mammalian mitochondrial ribosome (39S) was determined with a 4.9-Å resolution cryo-EM map and 70 inter-protein crosslinks (Ward et al., 2013). The molecular architecture of the RNA polymerase II transcription pre-initiation complex was determined with a 16-Å resolution cryo-EM map plus 157 intra-protein and 109 inter-protein crosslinks (Alber et al., 2008). The atomic model of type III secretion system needle was determined with a 19.5-Å resolution cryo-EM map and solid-state nuclear magnetic resonance (NMR) data (Loquet et al., 2012). "
    ABSTRACT: Structures of biomolecular systems are increasingly computed by integrative modeling that relies on varied types of experimental data and theoretical information. We describe here the proceedings and conclusions from the first wwPDB Hybrid/Integrative Methods Task Force Workshop held at the European Bioinformatics Institute in Hinxton, UK, on October 6 and 7, 2014. At the workshop, experts in various experimental fields of structural biology, experts in integrative modeling and visualization, and experts in data archiving addressed a series of questions central to the future of structural biology. How should integrative models be represented? How should the data and integrative models be validated? What data should be archived? How should the data and models be archived? What information should accompany the publication of integrative models? Copyright © 2015 Elsevier Ltd. All rights reserved.
    Structure 06/2015; 23(7). DOI:10.1016/j.str.2015.05.013 · 5.62 Impact Factor
    • "Furthermore, they usually do not actively use additional orthogonal information that may be available, such as mutagenesis or mass spectrometry cross-link data. Only a few approaches have been published that can incorporate a variety of data (Alber et al., 2008), one of which is the Integrative Modeling Platform (IMP) developed by the Sali group, which has the capability of integrating cryo-EM data among others (Topf et al., 2008; Schneidman-Duhovny et al., 2012; Velázquez-Muriel et al., 2012). Another approach is our in-house data-driven docking software HADDOCK (Dominguez et al., 2003; De Vries et al., 2010a), which is already capable of actively using information from various sources, such as mutagenesis, NMR H/D exchange and cross-links data, to name only a few. "
    ABSTRACT: Protein-protein interactions play a central role in all cellular processes. Insight into their atomic architecture is therefore of paramount importance. Cryo-electron microscopy (cryo-EM) is capable of directly imaging large macromolecular complexes. Unfortunately, the resolution is usually not sufficient for a direct atomic interpretation. To overcome this, cryo-EM data are often combined with high-resolution atomic structures. However, current computational approaches typically do not include information from other experimental sources nor a proper physico-chemical description of the interfaces. Here we describe the integration of cryo-EM data into our data-driven docking program HADDOCK and its performance on a benchmark of 17 complexes. The approach is demonstrated on five systems using experimental cryo-EM data in the range of 8.5-21 Å resolution. For several cases, cryo-EM data are integrated with additional interface information, e.g. mutagenesis and hydroxyl radical footprinting data. The resulting models have high-quality interfaces, revealing novel details of the interactions. Copyright © 2015 Elsevier Ltd. All rights reserved.
    Structure 04/2015; 23(5). DOI:10.1016/j.str.2015.03.014 · 5.62 Impact Factor
    • "Still, it is important to point out that, while producing an X-ray structure of a macromolecular assembly is challenging, low-resolution data are usually more accessible. While this structural information can be used to filter or re-rank the models produced by ab initio methods, it can also be used to directly guide the assembly process by providing geometric restraints that the final assembly should respect (Lensink and Wodak, 2010; Alber et al., 2008). The advantage of such an approach is that the search space can be greatly reduced, as a consequence reducing the computational effort needed to explore the assembly's conformational space. "
    ABSTRACT: Proteins often assemble in multimeric complexes to perform a specific biologic function. However, trapping these high-order conformations is difficult experimentally. Therefore, predicting how proteins assemble using in silico techniques can be of great help. The size of the associated conformational space and the fact that proteins are intrinsically flexible structures make this optimization problem extremely challenging. Nonetheless, known experimental spatial restraints can guide the search process, contributing to model biologically relevant states. We present here a swarm intelligence optimization protocol able to predict the arrangement of protein symmetric assemblies by exploiting a limited amount of experimental restraints and steric interactions. Importantly, within this scheme the native flexibility of each protein subunit is taken into account as extracted from molecular dynamics (MD) simulations. We show that this is a key ingredient for the prediction of biologically functional assemblies when, upon oligomerization, subunits explore activated states undergoing significant conformational changes.
    Structure 06/2013; 21(7). DOI:10.1016/j.str.2013.05.014 · 5.62 Impact Factor