Integrating Diverse Data for Structure Determination of Macromolecular Assemblies

Department of Biopharmaceutical Sciences, and California Institute for Quantitative Biosciences, University of California at San Francisco, CA 94158-2330, USA.
Annual Review of Biochemistry (Impact Factor: 26.53). 08/2008; 77(1):443-77. DOI: 10.1146/annurev.biochem.77.060407.135530
Source: PubMed

ABSTRACT To understand the cell, we need to determine the macromolecular assembly structures, which may consist of tens to hundreds of components. First, we review the varied experimental data that characterize the assemblies at several levels of resolution. We then describe computational methods for generating the structures using these data. To maximize completeness, resolution, accuracy, precision, and efficiency of the structure determination, a computational approach is required that uses spatial information from a variety of experimental methods. We propose such an approach, defined by its three main components: a hierarchical representation of the assembly, a scoring function consisting of spatial restraints derived from experimental data, and an optimization method that generates structures consistent with the data. This approach is illustrated by determining the configuration of the 456 proteins in the nuclear pore complex (NPC) from baker's yeast. With these tools, we are poised to integrate structural information gathered at multiple levels of the biological hierarchy--from atoms to cells--into a common framework.
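As an illustration only, the three components named in the abstract (a representation, a scoring function built from spatial restraints, and an optimization method) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' implementation or any published software: each component is reduced to a single bead, the restraints are simple harmonic distance targets plus an excluded-volume term, and the optimizer is a basic Monte Carlo search with a linearly decreasing temperature. The function names `score` and `optimize`, the bead radius, and all numerical parameters are hypothetical choices made for this example.

```python
import math
import random


def score(coords, distance_restraints, radius=1.0):
    """Toy scoring function: sum of squared restraint violations.

    distance_restraints is a list of (i, j, target_distance) tuples
    (a hypothetical stand-in for restraints derived from experiments).
    """
    total = 0.0
    for i, j, target in distance_restraints:
        total += (math.dist(coords[i], coords[j]) - target) ** 2
    # Excluded volume: penalize any pair of beads that overlap.
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            overlap = 2 * radius - math.dist(coords[i], coords[j])
            if overlap > 0:
                total += overlap ** 2
    return total


def optimize(n_beads, distance_restraints, steps=20000, seed=0):
    """Monte Carlo search with a linearly decreasing temperature."""
    rng = random.Random(seed)
    # Representation: one bead (x, y, z) per component, placed at random.
    coords = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(n_beads)]
    current = score(coords, distance_restraints)
    best, best_coords = current, [c[:] for c in coords]
    for step in range(steps):
        temperature = max(1e-3, 1.0 - step / steps)
        i = rng.randrange(n_beads)
        old = coords[i][:]
        coords[i] = [x + rng.gauss(0, 0.3) for x in old]  # random displacement
        trial = score(coords, distance_restraints)
        # Metropolis criterion: always accept improvements, sometimes accept
        # a worse configuration so the search can leave local minima.
        if trial <= current or rng.random() < math.exp((current - trial) / temperature):
            current = trial
            if current < best:
                best, best_coords = current, [c[:] for c in coords]
        else:
            coords[i] = old  # reject the move
    return best_coords, best


if __name__ == "__main__":
    # Three beads restrained to sit 3.0 units apart (an equilateral triangle).
    restraints = [(0, 1, 3.0), (1, 2, 3.0), (0, 2, 3.0)]
    model, final_score = optimize(3, restraints)
    print(f"best score: {final_score:.4f}")
```

In practice, such a search would be run from many random starting configurations, and the resulting set of low-scoring models would be compared to judge how precisely the data define the structure.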

Available from: Dmitry Korkin, Aug 22, 2015
    • "The molecular architecture of the large subunit of the mammalian mitochondrial ribosome (39S) was determined with a 4.9-Å resolution cryo-EM map and 70 inter-protein crosslinks (Ward et al., 2013). The molecular architecture of the RNA polymerase II transcription pre-initiation complex was determined with a 16-Å resolution cryo-EM map plus 157 intra-protein and 109 inter-protein crosslinks (Alber et al., 2008). The atomic model of type III secretion system needle was determined with a 19.5-Å resolution cryo-EM map and solid-state nuclear magnetic resonance (NMR) data (Loquet et al., 2012). "
    ABSTRACT: Structures of biomolecular systems are increasingly computed by integrative modeling that relies on varied types of experimental data and theoretical information. We describe here the proceedings and conclusions from the first wwPDB Hybrid/Integrative Methods Task Force Workshop held at the European Bioinformatics Institute in Hinxton, UK, on October 6 and 7, 2014. At the workshop, experts in various experimental fields of structural biology, experts in integrative modeling and visualization, and experts in data archiving addressed a series of questions central to the future of structural biology. How should integrative models be represented? How should the data and integrative models be validated? What data should be archived? How should the data and models be archived? What information should accompany the publication of integrative models? Copyright © 2015 Elsevier Ltd. All rights reserved.
    Structure 06/2015; 23(7). DOI:10.1016/j.str.2015.05.013 · 6.79 Impact Factor
    • "In the case of models deposited in the PDB for which SAS data have made an essential contribution to the final result, such as in combined NMR-SAS structural refinement, the SAS data need to be made available. Combinations of methods are increasingly being used to study biomolecular structures, especially as we strive to define more complex assemblies such as molecular machines or even cellular components (Alber et al., 2008 "
    ABSTRACT: This report presents the conclusions of the July 12-13, 2012 meeting of the Small-Angle Scattering Task Force of the worldwide Protein Data Bank (wwPDB; Berman et al., 2003) at Rutgers University in New Brunswick, New Jersey. The task force includes experts in small-angle scattering (SAS), crystallography, data archiving, and molecular modeling who met to consider questions regarding the contributions of SAS to modern structural biology. Recognizing there is a rapidly growing community of structural biology researchers acquiring and interpreting SAS data in terms of increasingly sophisticated molecular models, the task force recommends that (1) a global repository is needed that holds standard format X-ray and neutron SAS data that is searchable and freely accessible for download; (2) a standard dictionary is required for definitions of terms for data collection and for managing the SAS data repository; (3) options should be provided for including in the repository SAS-derived shape and atomistic models based on rigid-body refinement against SAS data along with specific information regarding the uniqueness and uncertainty of the model, and the protocol used to obtain it; (4) criteria need to be agreed upon for assessment of the quality of deposited SAS data and the accuracy of SAS-derived models, and the extent to which a given model fits the SAS data; (5) with the increasing diversity of structural biology data and models being generated, archiving options for models derived from diverse data will be required; and (6) thought leaders from the various structural biology disciplines should jointly define what to archive in the PDB and what complementary archives might be needed, taking into account both scientific needs and funding.
    Structure 06/2013; 21(6):875-81. DOI:10.1016/j.str.2013.04.020 · 6.79 Impact Factor
    • "All those data are being integrated with computational approaches for the modelling of large and complex systems (Melquiond et al. 2012). Examples of such integrative approaches are HADDOCK (Dominguez et al. 2003; de Vries et al. 2010), an information-driven docking program that can incorporate various sources of experimental data and bioinformatics prediction to model macromolecular assemblies, and the Integrative Modeling Platform (IMP) which was designed for the modeling of macromolecular assemblies (Alber et al. 2008). With these aspects in mind, the ability to directly relate structural information gathered by other biophysical methods to NMR parameters opens up new avenues to speed up different steps of an NMR-based analysis of molecular structure and motion. "
    ABSTRACT: We present a computational environment for Fast Analysis of multidimensional NMR DAta Sets (FANDAS) that allows assembling multidimensional data sets from a variety of input parameters and facilitates comparing and modifying such "in silico" data sets during the various stages of the NMR data analysis. The input parameters can vary from (partial) NMR assignments directly obtained from experiments to values retrieved from in silico prediction programs. The resulting predicted data sets enable a rapid evaluation of sample labeling in light of spectral resolution and structural content, using standard NMR software such as Sparky. In addition, direct comparison to experimental data sets can be used to validate NMR assignments, distinguish different molecular components, refine structural models or other parameters derived from NMR data. The method is demonstrated in the context of solid-state NMR data obtained for the cyclic nucleotide binding domain of a bacterial cyclic nucleotide-gated channel and on membrane-embedded sensory rhodopsin II. FANDAS is freely available as web portal under WeNMR ( ).
    Journal of Biomolecular NMR 11/2012; 54(4). DOI:10.1007/s10858-012-9681-y · 3.31 Impact Factor