ABSTRACT: It has become apparent that much of cellular metabolism is controlled by large well-folded noncoding RNA molecules. In addition to crystallographic approaches, computational methods are needed for visualizing the 3D structure of large RNAs. Here, we modeled the molecular structure of the ai5γ group IIB intron from yeast using the crystal structure of a bacterial group IIC homolog. This was accomplished by adapting strategies for homology and de novo modeling, and creating a new computational tool for RNA refinement. The resulting model was validated experimentally using a combination of structure-guided mutagenesis and RNA structure probing. The model provides major insights into the mechanism and regulation of splicing, such as the position of the branch-site before and after the second step of splicing, and the location of subdomains that control target specificity, underscoring the feasibility of modeling large functional RNA molecules.
Nucleic Acids Research 11/2013; 42(3). DOI:10.1093/nar/gkt1051 · 8.81 Impact Factor
ABSTRACT: Structured RNA molecules are key players in ensuring cellular viability. It is now emerging that, like proteins, the functions of many nucleic acids are dictated by their tertiary folds. At the same time, the number of known crystal structures of nucleic acids is also increasing rapidly. In this context, molecular replacement will become an increasingly useful technique for phasing nucleic acid crystallographic data in the near future. Here, strategies to select, create and refine molecular-replacement search models for nucleic acids are discussed. Using examples taken primarily from research on group II introns, it is shown that nucleic acids are amenable to different and potentially more flexible and sophisticated molecular-replacement searches than proteins. These observations specifically aim to encourage future crystallographic studies on the newly discovered repertoire of noncoding transcripts.
ABSTRACT: RNA crystals typically diffract to much lower resolutions than protein crystals. This low-resolution diffraction results in unclear density maps, which cause considerable difficulties during the model-building process. These difficulties are exacerbated by the lack of computational tools for RNA modeling. Here, RCrane, a tool for the partially automated building of RNA into electron-density maps of low or intermediate resolution, is presented. This tool works within Coot, a common program for macromolecular model building. RCrane helps crystallographers to place phosphates and bases into electron density and then automatically predicts and builds the detailed all-atom structure of the traced nucleotides. RCrane then allows the crystallographer to review the newly built structure and select alternative backbone conformations where desired. This tool can also be used to automatically correct the backbone structure of previously built nucleotides. These automated corrections can fix incorrect sugar puckers, steric clashes and other structural problems.
ABSTRACT: Unlike proteins, the RNA backbone has numerous degrees of freedom (eight, if one counts the sugar pucker), making RNA modeling, structure building and prediction a multidimensional problem of exceptionally high complexity. And yet RNA tertiary structures are not infinite in their structural morphology; rather, they are built from a limited set of discrete units. In order to reduce the dimensionality of the RNA backbone in a physically reasonable way, a shorthand notation was created that reduced the RNA backbone torsion angles to two (η and θ, analogous to φ and ψ in proteins). When these torsion angles are calculated for nucleotides in a crystallographic database and plotted against one another, one obtains a plot analogous to a Ramachandran plot (the η/θ plot), with highly populated and unpopulated regions. Nucleotides that occupy proximal positions on the plot have identical structures and are found in the same units of tertiary structure. In this review, we describe the statistical validation of the η/θ formalism and the exploration of features within the η/θ plot. We also describe the application of the η/θ formalism in RNA motif discovery, structural comparison, RNA structure building and tertiary structure prediction. More than a tool, however, the η/θ formalism has provided new insights into RNA structure itself, revealing its fundamental components and the factors underlying RNA architectural form.
ABSTRACT: Structured RNA molecules play essential roles in a variety of cellular processes; however, crystallographic studies of such RNA molecules present a large number of challenges. One notable complication arises from the low resolutions typical of RNA crystallography, which results in electron density maps that are imprecise and difficult to interpret. This problem is exacerbated by the lack of computational tools for RNA modeling, as many of the techniques commonly used in protein crystallography have no equivalents for RNA structure. This leads to difficulty and errors in the model building process, particularly in modeling of the RNA backbone, which is highly error prone due to the large number of variable torsion angles per nucleotide. To address this, we have developed a method for accurately building the RNA backbone into maps of intermediate or low resolution. This method is semiautomated, as it requires a crystallographer to first locate phosphates and bases in the electron density map. After this initial trace of the molecule, however, an accurate backbone structure can be built without further user intervention. To accomplish this, backbone conformers are first predicted using RNA pseudotorsions and the base-phosphate perpendicular distance. Detailed backbone coordinates are then calculated to conform both to the predicted conformer and to the previously located phosphates and bases. This technique is shown to produce accurate backbone structure even when starting from imprecise phosphate and base coordinates. A program implementing this methodology is currently available, and a plugin for the Coot model building program is under development.
Proceedings of the National Academy of Sciences 05/2010; 107(18):8177-82. DOI:10.1073/pnas.0911888107 · 9.81 Impact Factor
ABSTRACT: Group II introns are large ribozymes that act as self-splicing and retrotransposable RNA molecules. They are of great interest because of their potential evolutionary relationship to the eukaryotic spliceosome, their continued influence on the organization of many genomes in bacteria and eukaryotes, and their potential utility as tools for gene therapy and biotechnology. One of the most interesting features of group II introns is their relative lack of nucleobase conservation and covariation, which has long suggested that group II intron structures are stabilized by numerous unusual tertiary interactions and backbone-mediated contacts. Here, we provide a detailed description of the tertiary interaction networks within the Oceanobacillus iheyensis group IIC intron, for which a crystal structure was recently solved to 3.1 Å resolution. The structure can be described as a set of several intricately constructed tertiary interaction nodes, each of which contains a core of extended stacking networks and elaborate motifs. Many of these nodes are surrounded by a web of ribose zippers, which appear to further stabilize local structure. As predicted from biochemical and genetic studies, the group II intron provides a wealth of new information on strategies for RNA folding and tertiary structural organization.
ABSTRACT: Group II introns are self-splicing, mobile genetic elements that have fundamentally influenced the organization of terrestrial genomes. These large ribozymes remain important for gene expression in almost all forms of bacteria and eukaryotes and they are believed to share a common ancestry with the eukaryotic spliceosome that is required for processing all nuclear pre-mRNAs. The three-dimensional structure of a group IIC intron was recently determined by X-ray crystallography, making it possible to visualize the active site and the elaborate network of tertiary interactions that stabilize the molecule. Here we describe the molecular features of the active site in detail and evaluate their correspondence with prior biochemical, genetic, and phylogenetic analyses on group II introns. In addition, we evaluate the structural significance of RNA motifs within the intron core, such as the major-groove triple helix and the domain 5 bulge. Having combined what is known about the group II intron core, we then compare it with known structural features of U6 snRNA in the eukaryotic spliceosome. This analysis leads to a set of predictions for the molecular structure of the spliceosomal active site.
ABSTRACT: Intron splicing is a fundamental biological process whereby noncoding sequences are removed from precursor RNAs. Recent work has provided new insights into the structural features and reaction mechanisms of two introns that catalyze their own splicing from precursor RNA: the group I and II introns. In addition, there is an increasing amount of structural information on the spliceosome, which is a ribonucleoprotein machine that catalyzes nuclear pre-mRNA splicing in eukaryotes. Here, we compare structures and catalytic mechanisms of self-splicing RNAs and we discuss the possible implications for spliceosomal reaction mechanisms.
Current Opinion in Structural Biology 06/2009; 19(3):260-6. DOI:10.1016/j.sbi.2009.04.002 · 8.75 Impact Factor
ABSTRACT: Hinge motions are important for molecular recognition, and knowledge of their location can guide the sampling of protein conformations for docking. Predicting domains and intervening hinges is also important for identifying structurally self-determinate units and anticipating the influence of mutations on protein flexibility and stability. Here we present StoneHinge, a novel approach for predicting hinges between domains using input from two complementary analyses of noncovalent bond networks: StoneHingeP, which identifies domain-hinge-domain signatures in ProFlex constraint counting results, and StoneHingeD, which does the same for DomDecomp Gaussian network analyses. Predictions for the two methods are compared to hinges defined in the literature and by visual inspection of interpolated motions between conformations in a series of proteins. For StoneHingeP, all the predicted hinges agree with hinge sites reported in the literature or observed visually, although some predictions include extra residues. Furthermore, no hinges are predicted in six hinge-free proteins. On the other hand, StoneHingeD tends to overpredict the number of hinges, while accurately pinpointing hinge locations. By determining the consensus of their results, StoneHinge improves the specificity, predicting 11 of 13 hinges found both visually and in the literature for nine different open protein structures, and making no false-positive predictions. By comparison, a popular hinge detection method that requires knowledge of both the open and closed conformations finds 10 of the 13 known hinges, while predicting four additional, false hinges.
Protein Science 02/2009; 18(2):359-71. DOI:10.1002/pro.38 · 2.86 Impact Factor
ABSTRACT: Protein motion is often the link between structure and function and a substantial fraction of proteins move through a domain hinge bending mechanism. Predicting the location of the hinge from a single structure is thus a logical first step towards predicting motion. Here, we describe ways to predict the hinge location by grouping residues with correlated normal-mode motions. We benchmarked our normal-mode based predictor against a gold standard set of carefully annotated hinge locations taken from the Database of Macromolecular Motions. We then compared it with three existing structure-based hinge predictors (TLSMD, StoneHinge, and FlexOracle), plus HingeSeq, a sequence-based hinge predictor. Each of these methods predicts hinges using very different sources of information: normal modes, experimental thermal factors, bond constraint networks, energetics, and sequence, respectively. Thus it is logical that using these algorithms together would improve predictions. We integrated all the methods into a combined predictor using a weighted voting scheme. Finally, we encapsulated all our results in a web tool which can be used to run all the predictors on submitted proteins and visualize the results.
Proteins Structure Function and Bioinformatics 11/2008; 73(2):299-319. DOI:10.1002/prot.22060 · 2.92 Impact Factor
ABSTRACT: Free group II introns are infectious retroelements that can bind and insert themselves into RNA and DNA molecules via reverse splicing. Here we report the 3.4-Å crystal structure of a complex between an oligonucleotide target substrate and a group IIC intron, as well as the refined free intron structure. The structure of the complex reveals the conformation of motifs involved in exon recognition by group II introns.
ABSTRACT: Tetraloops are a common building block for RNA tertiary structure, and most tetraloops fall into one of three well-characterized classes: GNRA, UNCG, and CUYG. Here, we present the sequence and structure of a fourth highly conserved class of tetraloop that occurs only within the zeta-zeta' interaction of group IIC introns. This GANC tetraloop was identified, along with an unusual cognate receptor, in the crystal structure of the group IIC intron and through phylogenetic analysis of intron RNA sequence alignments. Unlike conventional tetraloop-receptor interactions, which are stabilized by extensive hydrogen-bonding interactions, the GANC-receptor interaction is limited to a single base stack between the conserved adenosine of the tetraloop and a single purine of the receptor, which consists of a one- to three-nucleotide bulge and does not contain an A-platform. Unlike GNRA tetraloops, the GANC tetraloop forms a sharp angle relative to the adjacent helix, bending by approximately 45 degrees toward the major groove side of the helix. These structural attributes allow GANC tetraloops to fit precisely within the group IIC intron core, thereby demonstrating that structural motifs can adapt to function in a specific niche.
ABSTRACT: Group II introns are self-splicing ribozymes that catalyze their own excision from precursor transcripts and insertion into new genetic locations. Here we report the crystal structure of an intact, self-spliced group II intron from Oceanobacillus iheyensis at 3.1 angstrom resolution. An extensive network of tertiary interactions facilitates the ordered packing of intron subdomains around a ribozyme core that includes catalytic domain V. The bulge of domain V adopts an unusual helical structure that is located adjacent to a major groove triple helix (catalytic triplex). The bulge and catalytic triplex jointly coordinate two divalent metal ions in a configuration that is consistent with a two-metal ion mechanism for catalysis. Structural and functional analogies support the hypothesis that group II introns and the spliceosome share a common ancestor.
ABSTRACT: A consensus classification and nomenclature are defined for RNA backbone structure using all of the backbone torsion angles. By a consensus of several independent analysis methods, 46 discrete conformers are identified as suitably clustered in a quality-filtered, multidimensional dihedral angle distribution. Most of these conformers represent identifiable features or roles within RNA structures. The conformers are given two-character names that reflect the seven-angle delta epsilon zeta alpha beta gamma delta combinations empirically found favorable for the sugar-to-sugar "suite" unit within which the angle correlations are strongest (e.g., 1a for A-form, 5z for the start of S-motifs). Since the half-nucleotides are specified by a number for delta epsilon zeta and a lowercase letter for alpha beta gamma delta, this modular system can also be parsed to describe traditional nucleotide units (e.g., a1) or the dinucleotides (e.g., a1a1) that are especially useful at the level of crystallographic map fitting. This nomenclature can also be written as a string with two-character suite names between the uppercase letters of the base sequence (N1aG1gN1aR1aA1cN1a for a GNRA tetraloop), facilitating bioinformatic comparisons. Cluster means, standard deviations, coordinates, and examples are made available, as well as the Suitename software that assigns suite conformer names and conformer match quality (suiteness) from atomic coordinates. The RNA Ontology Consortium will combine this new backbone system with others that define base pairs, base-stacking, and hydrogen-bond relationships to provide a full description of RNA structural motifs.
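The string form of the suite nomenclature described in the abstract above lends itself to simple tokenizing. The following is a minimal sketch, not the Suitename software: the function name `parse_suite_string` is hypothetical, and it assumes every suite name is one digit followed by one lowercase letter, as in the abstract's example. It pairs each base letter with the two-character name that follows it in the string and makes no claim about which nucleotide that suite formally belongs to.

```python
import re

# Assumed token shape: one uppercase base letter, then a digit and a
# lowercase letter (the two-character suite name), e.g. "G1g".
TOKEN = re.compile(r"([A-Z])(\d[a-z])")


def parse_suite_string(text):
    """Split a suite string into (base letter, suite name) pairs."""
    tokens = TOKEN.findall(text)
    # Reject input that the simple pattern does not cover completely.
    if "".join(base + suite for base, suite in tokens) != text:
        raise ValueError("unexpected characters in suite string: %r" % text)
    return tokens


if __name__ == "__main__":
    pairs = parse_suite_string("N1aG1gN1aR1aA1cN1a")
    print("".join(base for base, _ in pairs))  # base letters only
    print([suite for _, suite in pairs])       # suite names only
```

Separating the two streams this way makes it easy to compare backbone conformer patterns independently of sequence, which is the kind of bioinformatic comparison the abstract mentions.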
ABSTRACT: Quantitatively describing RNA structure and conformational elements remains a formidable problem. Seven standard torsion angles and the sugar pucker are necessary to characterize the conformation of an RNA nucleotide completely. Progress has been made toward understanding the discrete nature of RNA structure, but classifying simple and ubiquitous structural elements such as helices and motifs remains a difficult task. One approach for describing RNA structure in a simple, mathematically consistent, and computationally accessible manner involves the invocation of two pseudotorsions, eta (C4'(n-1), P(n), C4'(n), P(n+1)) and theta (P(n), C4'(n), P(n+1), C4'(n+1)), which can be used to describe RNA conformation in much the same way that phi and psi are used to describe backbone configuration of proteins. Here, we conduct an exploration and statistical evaluation of pseudotorsional space and of the Ramachandran-like eta-theta plot. We show that, through the rigorous quantitative analysis of the eta-theta plot, the pseudotorsional descriptors eta and theta, together with sugar pucker, are sufficient to describe RNA backbone conformation fully in most cases. These descriptors are also shown to contain considerable information about nucleotide base conformation, revealing a previously uncharacterized interplay between backbone and base orientation. A window function analysis is used to discern statistically relevant regions of density in the eta-theta scatter plot and then nucleotides in colocalized clusters in the eta-theta plane are shown to have similar 3-D structures through RMSD analysis of the RNA structural constituents. We find that major clusters in the eta-theta plot are few, underscoring the discrete nature of RNA backbone conformation. Like the Ramachandran plot, the eta-theta plot is a valuable system for conceptualizing biomolecular conformation, it is a useful tool for analyzing RNA tertiary structures, and it is a vital component of new approaches for solving the 3-D structures of large RNA molecules and RNA assemblies.
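The two pseudotorsions defined in the abstract above are ordinary dihedral angles over four atoms each, so they can be sketched with a generic dihedral routine. The code below is an illustration under stated assumptions, not the authors' implementation: the function names `dihedral` and `eta_theta` are hypothetical, coordinates are plain (x, y, z) tuples, and angles are returned in degrees in the range -180 to 180.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


def eta_theta(c4_prev, p, c4, p_next, c4_next):
    """Pseudotorsions for nucleotide n, using the atom order in the abstract.

    eta   = C4'(n-1), P(n), C4'(n), P(n+1)
    theta = P(n), C4'(n), P(n+1), C4'(n+1)
    """
    return dihedral(c4_prev, p, c4, p_next), dihedral(p, c4, p_next, c4_next)


if __name__ == "__main__":
    # Made-up coordinates, chosen only to exercise the function.
    print(eta_theta((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 2)))
```

Computing the pair for every nucleotide with both neighbours present yields the points of an eta-theta scatter plot; terminal nucleotides lack one of the required atoms and would have to be skipped.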
ABSTRACT: The database of molecular motions, MolMovDB (http://molmovdb.org), has been in existence for the past decade. It classifies macromolecular motions and provides tools to interpolate between two conformations (the Morph Server) and predict possible motions in a single structure. In 2005, we expanded the services offered on MolMovDB. In particular, we further developed the Morph Server to produce improved interpolations between two submitted structures. We added support for multiple chains to the original adiabatic mapping interpolation, allowing the analysis of subunit motions. We also added the option of using FRODA interpolation, which allows for more complex pathways, potentially overcoming steric barriers. We added an interface to a hinge prediction service, which acts on single structures and predicts likely residue points for flexibility. We developed tools to relate such points of flexibility in a structure to particular key residue positions, i.e. active sites or highly conserved positions. Lastly, we began relating our motion classification scheme to function using descriptions from the Gene Ontology Consortium.