MOTIF-EM: an automated computational tool for identifying conserved regions in CryoEM structures

NIH Center for Biomedical Computation, Stanford University, Stanford, CA 94305, USA.
Bioinformatics (Impact Factor: 4.62). 06/2010; 26(12):i301-9. DOI: 10.1093/bioinformatics/btq195
Source: PubMed

ABSTRACT We present a new, first-of-its-kind, fully automated computational tool, MOTIF-EM, for identifying regions, domains, or motifs in cryoEM maps of large macromolecular assemblies (such as chaperonins and viruses) that remain conformationally conserved. As a by-product, regions in structures that are not conserved are revealed: this can indicate local molecular flexibility related to biological activity. MOTIF-EM takes cryoEM volumetric maps as inputs. The technique used by MOTIF-EM to detect conserved sub-structures is inspired by a recent breakthrough in 2D object recognition. The technique works by constructing rotationally invariant, low-dimensional representations of local regions in the input cryoEM maps. Correspondences are established between the reduced representations (by comparing them using a simple metric) across the input maps. The correspondences are clustered using hash tables, and graph theory is used to retrieve conserved structural domains or motifs. MOTIF-EM has been used to extract conserved domains occurring in large macromolecular assembly maps, including those of viruses P22 and epsilon 15, the 70S ribosome, and GroEL, that remain structurally conserved in different functional states. Our method can also be used to build atomic models for some maps. We also used MOTIF-EM to identify the conserved folds shared among dsDNA bacteriophages HK97, Epsilon 15, and φ29, though they have low sequence similarity.
Supplementary information: Supplementary data are available at Bioinformatics online.
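
The first two steps described in the abstract (rotationally invariant local representations, then correspondences found with a simple metric) can be illustrated with a toy sketch. This is not the MOTIF-EM implementation: the radial shell profile used here is a simple stand-in descriptor chosen for illustration, the sparse-dictionary map format and all function names are assumptions, and the hash-table clustering and graph-based retrieval steps are omitted.

```python
import math
from itertools import product


def radial_descriptor(density, center, radius):
    """Mean density in concentric shells around `center`.

    `density` is a hypothetical sparse map: dict of (x, y, z) -> float,
    with missing voxels treated as 0. Shell averages depend only on the
    distance from the center, so they do not change when the map is
    rotated by a grid-preserving rotation.
    """
    sums = [0.0] * (radius + 1)
    counts = [0] * (radius + 1)
    cx, cy, cz = center
    for dx, dy, dz in product(range(-radius, radius + 1), repeat=3):
        shell = round(math.sqrt(dx * dx + dy * dy + dz * dz))
        if shell > radius:
            continue
        sums[shell] += density.get((cx + dx, cy + dy, cz + dz), 0.0)
        counts[shell] += 1
    return [s / c for s, c in zip(sums, counts)]


def match_descriptors(desc_a, desc_b, max_dist):
    """Pair each keypoint in desc_a with its nearest keypoint in desc_b
    (Euclidean distance between descriptors), keeping close pairs only."""
    matches = []
    for key_a, vec_a in desc_a.items():
        key_b, dist = min(
            ((kb, math.dist(vec_a, vb)) for kb, vb in desc_b.items()),
            key=lambda pair: pair[1],
        )
        if dist <= max_dist:
            matches.append((key_a, key_b, dist))
    return matches


def rotate_z90_point(point):
    """Rotate a grid point by 90 degrees about the z axis."""
    x, y, z = point
    return (-y, x, z)


def rotate_z90(density):
    """Rotate a whole sparse map by 90 degrees about the z axis."""
    return {rotate_z90_point(p): v for p, v in density.items()}


# Toy data: a tiny map and a rotated copy standing in for two cryoEM maps.
density = {(0, 0, 0): 5.0, (1, 0, 0): 3.0, (0, 2, 0): 2.0, (1, 1, 1): 1.0}
rotated = rotate_z90(density)
keypoints = [(0, 0, 0), (1, 0, 0)]

desc_a = {p: radial_descriptor(density, p, 2) for p in keypoints}
desc_b = {
    rotate_z90_point(p): radial_descriptor(rotated, rotate_z90_point(p), 2)
    for p in keypoints
}
matches = match_descriptors(desc_a, desc_b, max_dist=1e-9)
```

In this toy case each keypoint in the original map is paired with its rotated counterpart, because the shell profile ignores orientation. Real maps would need a far more discriminative descriptor and a tolerance suited to noise and resolution.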

  • ABSTRACT: MOTIVATION: Due to the size and complexity of large multi-component biological assemblies, the most tractable approach to determining their atomic structure is often to fit high-resolution X-ray or NMR structures of isolated components into lower resolution electron density maps of the larger assembly obtained via cryo-electron microscopy. This hybrid approach to structure determination requires that an atomic resolution structure of each component, or a suitable homolog, is available. If neither is available, then the amount of structural information regarding that component is limited by the resolution of the cryo-EM map. However, even if a suitable homolog cannot be identified via sequence analysis, a search for structural homologs should still be performed, since structural homology often persists throughout evolution even when sequence homology is undetectable. Since macromolecules can often be described as a collection of independently folded domains, one way of searching for structural homologs would be to systematically fit representative domain structures from a protein domain database into the medium/low resolution cryo-EM map and return the best fits. Taken together, the best fitting non-overlapping structures would constitute a "mosaic" backbone model of the assembly that could aid map interpretation and illuminate biological function. RESULT: Utilizing the computational principles of the Scale Invariant Feature Transform (SIFT), we have developed FOLD-EM, a computational tool that can identify folded macromolecular domains in medium to low resolution (4-15 Å) electron density maps and return a model of the constituent polypeptides in a fully automated fashion. As a by-product, FOLD-EM can also do flexible multi-domain fitting that may provide insight into conformational changes that occur in macromolecular assemblies. AVAILABILITY AND IMPLEMENTATION: FOLD-EM is available, at:, as free open source software to the structural biology scientific community. CONTACT:
    Bioinformatics 11/2012; 28(24). DOI:10.1093/bioinformatics/bts616 · 4.62 Impact Factor
  • Source
    ABSTRACT: Functional rearrangements in biomolecular assemblies result from diffusion across an underlying energy landscape. While bulk kinetic measurements rely on discrete state-like approximations to the energy landscape, single-molecule methods can project the free energy onto specific coordinates. With measures of the diffusion, one may establish a quantitative bridge between state-like kinetic measurements and the continuous energy landscape. We used an all-atom molecular dynamics simulation of the 70S ribosome (2.1 million atoms; 1.3 microseconds) to provide this bridge for specific conformational events associated with the process of tRNA translocation. Starting from a pre-translocation configuration, we identified sets of residues that collectively undergo rotary rearrangements implicated in ribosome function. Estimates of the diffusion coefficients along these collective coordinates for translocation were then used to interconvert between experimental rates and measures of the energy landscape. This analysis, in conjunction with previously reported experimental rates of translocation, provides an upper-bound estimate of the free-energy barriers associated with translocation. While this analysis was performed for a particular kinetic scheme of translocation, the quantitative framework is general and may be applied to energetic and kinetic descriptions that include any number of intermediates and transition states.
    PLoS Computational Biology 03/2013; 9(3):e1003003. DOI:10.1371/journal.pcbi.1003003 · 4.83 Impact Factor
  • Source
    ABSTRACT: Many of the most important functions in the cell are carried out by proteins organized in large molecular machines. Cryo-electron microscopy (cryo-EM) is increasingly being used to obtain low resolution density maps of these large assemblies. A new method, ATTRACT-EM, for the computational assembly of molecular assemblies from their components has been developed. Based on concepts from the protein-protein docking field, it utilizes cryo-EM density maps to assemble molecular subunits at near atomic detail, starting from millions of initial subunit configurations. The search efficiency was further enhanced by recombining partial solutions, the inclusion of symmetry information, and refinement using a molecular force field. The approach was tested on the GroES-GroEL system, using an experimental cryo-EM map at 23.5 Å resolution, and on several smaller complexes. Inclusion of experimental information on the symmetry of the systems and the application of a new gradient vector matching algorithm allowed the efficient identification of docked assemblies in close agreement with experiment. Application to the GroES-GroEL complex resulted in a top ranked model with a deviation of 4.6 Å (and a 2.8 Å model within the top 10) from the GroES-GroEL crystal structure, a significant improvement over existing methods.
    PLoS ONE 12/2012; 7(12):e49733. DOI:10.1371/journal.pone.0049733 · 3.53 Impact Factor