Local sequence-structure motifs in RNA.

Chair for Bioinformatics at the Institute of Computer Science, Friedrich-Schiller-Universitaet Jena, Ernst-Abbe-Platz 2, D-07743 Jena, Germany.
Journal of Bioinformatics and Computational Biology (Impact Factor: 0.93). 01/2005; 2(4):681-98. DOI: 10.1142/S0219720004000818
Source: PubMed

ABSTRACT Ribonuclic acid (RNA) enjoys increasing interest in molecular biology; despite this interest fundamental algorithms are lacking, e.g. for identifying local motifs. As proteins, RNA molecules have a distinctive structure. Therefore, in addition to sequence information, structure plays an important part in assessing the similarity of RNAs. Furthermore, common sequence-structure features in two or several RNA molecules are often only spatially local, where possibly large parts of the molecules are dissimilar. Consequently, we address the problem of comparing RNA molecules by computing an optimal local alignment with respect to sequence and structure information. While local alignment is superior to global alignment for identifying local similarities, no general local sequence-structure alignment algorithms are currently known. We suggest a new general definition of locality for sequence-structure alignments that is biologically motivated and efficiently tractable. To show the former, we discuss locality of RNA and prove that the defined locality means connectivity by atomic and non-atomic bonds. To show the latter, we present an efficient algorithm for the newly defined pairwise local sequence-structure alignment (lssa) problem for RNA. For molecules of lengthes n and m, the algorithm has worst-case time complexity of O(n2 x m2 x max(n,m)) and a space complexity of only O(n x m). An implementation of our algorithm is available at Its runtime is competitive with global sequence-structure alignment.

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Starting with the discovery of microRNAs and the advent of genome-wide transcriptomics, non-protein-coding transcripts have moved from a fringe topic to a central field research in molecular biology. In this contribution we review the state of the art of "computational RNomics", i.e., the bioinformatics approaches to genome-wide RNA annotation. Instead of rehashing results from recently published surveys in detail, we focus here on the open problem in the field, namely (functional) annotation of the plethora of putative RNAs. A series of exploratory studies are used to provide non-trivial examples for the discussion of some of the difficulties.
    Journal of Experimental Zoology Part B Molecular and Developmental Evolution 02/2007; 308(1):1-25. · 2.12 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Novel algorithms for structural alignment of non-coding RNAs
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: BACKGROUND: The search for distant homologs has become an import issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAsor microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to-date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? RESULTS: Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA's algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query. We observe, furthermore, that the sensitivity is bounded in particular by the limited accuracy of the predicted local structures of the target sequence. CONCLUSIONS: Although we demonstrate that a purely structure-based homology search is feasible in principle, it is unlikely to outperform tools such as Infernal in most application scenarios, where a substantial amount of sequence information is typically available. The LocARNAscan approach will profit, however, from high throughput methods to determine RNA secondary structure. In transcriptomewide applications, such methods will provide accurate structure annotations on the target side. AVAILABILITY: Source code of the free software LocARNAscan 1.0 and supplementary data are available at
    Algorithms for Molecular Biology 04/2013; 8(1):14. · 1.61 Impact Factor


Available from