Alignment of RNA base pairing probability matrices.

Institut für Theoretische Chemie und Molekulare Strukturbiologie, Universität Wien, Währingerstrasse 17, Vienna, Austria.
Bioinformatics (Impact Factor: 5.32). 10/2004; 20(14):2222-7. DOI: 10.1093/bioinformatics/bth229
Source: PubMed

ABSTRACT Many classes of functional RNA molecules are characterized by highly conserved secondary structures but little detectable sequence similarity. Reliable multiple alignments can therefore be constructed only when the shared structural features are taken into account. Since multiple alignments are used as input for many subsequent methods of data analysis, structure-based alignments are indispensable in RNA bioinformatics.
We present here a method to compute pairwise and progressive multiple alignments from the direct comparison of base pairing probability matrices. Instead of attempting to solve the folding and the alignment problem simultaneously, as in the classical Sankoff algorithm, we use McCaskill's approach to compute base pairing probability matrices, which effectively incorporate the information on the energetics of each sequence. A novel, simplified variant of Sankoff's algorithm can then be employed to extract the maximum-weight common secondary structure and an associated alignment.
The programs pmcomp and pmmulti described in this contribution are implemented in Perl and can be downloaded together with the example datasets from http://www.tbi.univie.ac.at/RNA/PMcomp/. A web server is available at http://rna.tbi.univie.ac.at/cgi-bin/pmcgi.pl
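
The idea in the abstract, scoring an alignment and a common structure together from two precomputed base pairing probability matrices, can be illustrated with a small dynamic program. The following Python sketch is not the pmcomp implementation (which is written in Perl). The function name `common_structure_score`, the scoring parameters `match`, `mismatch` and `gap`, and the use of raw pairing probabilities as pair weights are all assumptions made here for illustration. The probability matrices are taken as given, for example from a McCaskill-style partition function computation.

```python
from functools import lru_cache


def common_structure_score(seq_a, seq_b, pairs_a, pairs_b,
                           match=0.1, mismatch=0.0, gap=-0.5):
    """Best joint score of an alignment plus a common non-crossing structure.

    pairs_a and pairs_b map (i, j) with i < j (0-based) to a base pairing
    probability. All scoring parameters are illustrative assumptions.
    """

    @lru_cache(maxsize=None)
    def score(i, j, k, l):
        # Score for aligning seq_a[i..j] with seq_b[k..l], bounds inclusive.
        len_a, len_b = j - i + 1, l - k + 1
        if len_a <= 0 or len_b <= 0:
            # One side is empty, so the remainder of the other side is gapped.
            return gap * (max(len_a, 0) + max(len_b, 0))

        sub = match if seq_a[i] == seq_b[k] else mismatch
        best = max(
            score(i + 1, j, k, l) + gap,      # seq_a[i] against a gap
            score(i, j, k + 1, l) + gap,      # seq_b[k] against a gap
            score(i + 1, j, k + 1, l) + sub,  # unpaired i aligned to unpaired k
        )
        # Pair (i, h) in A matched with pair (k, q) in B: score the region
        # enclosed by both pairs, then the region to the right of them.
        for h in range(i + 1, j + 1):
            p_a = pairs_a.get((i, h), 0.0)
            if p_a <= 0.0:
                continue
            for q in range(k + 1, l + 1):
                p_b = pairs_b.get((k, q), 0.0)
                if p_b <= 0.0:
                    continue
                paired = (p_a + p_b
                          + score(i + 1, h - 1, k + 1, q - 1)
                          + score(h + 1, j, q + 1, l))
                best = max(best, paired)
        return best

    return score(0, len(seq_a) - 1, 0, len(seq_b) - 1)


if __name__ == "__main__":
    # Toy input: two hairpins whose closing pair has probability 0.9.
    print(common_structure_score("GAAAC", "GAAC",
                                 {(0, 4): 0.9}, {(0, 3): 0.9}))
```

Because the folding energetics are already summarized in the two matrices, the recursion only needs to consider pairs with non-zero probability, which is what keeps it simpler than solving folding and alignment simultaneously. The sketch returns only the score; recovering the alignment and structure would require an additional traceback.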

  • ABSTRACT: Incorporating secondary structure information into the alignment process improves the quality of RNA sequence alignments. Instead of using fixed weighting parameters, sequence and structure components can be treated as different objectives and optimized simultaneously. The result is not a single solution but a Pareto set of equally optimal solutions, each of which corresponds to a different possible weighting. We now provide the interactive graphical software tool RNA-Pareto, which allows direct inspection of all feasible results of the pairwise RNA sequence-structure alignment problem and greatly facilitates the exploration of the optimal solution set. Availability and Implementation: The software is written in Java 6 (graphical user interface) and C++ (dynamic programming algorithms). The source code and binaries for Linux, Windows and Mac OS are freely available at http://sysbio.uni-ulm.de and are licensed under the GNU GPLv3. Contact: hans.kestler@uni-ulm.de.
    Bioinformatics 09/2013; · 5.47 Impact Factor
  • ABSTRACT: Motivation: There is increasing evidence of pervasive transcription, resulting in hundreds of thousands of ncRNAs of unknown function. Standard computational analysis tasks for inferring functional annotations, such as clustering, require fast and accurate RNA comparisons based on sequence and structure similarity. The gold standard for the latter is Sankoff's algorithm [3], which simultaneously aligns and folds RNAs. Because of its extreme time complexity of O(n^6), numerous faster "Sankoff-style" approaches have been suggested. Several such approaches introduce heuristics based on sequence alignment, which compromises the alignment quality for RNAs with sequence identities below 60% [1]. Avoiding such heuristics, as e.g. in LocARNA [4], has been assumed to prohibit time complexities better than O(n^4), which strongly limits large-scale applications.
    Proceedings of the 17th international conference on Research in Computational Molecular Biology; 04/2013
  • ABSTRACT: ExpaRNA's core algorithm computes, for two fixed RNA structures, a maximal non-overlapping set of maximal exact matchings. We introduce an algorithm, ExpaRNA-P, that solves the lifted problem of finding such sets of exact matchings in entire Boltzmann-distributed structure ensembles of two RNAs. Due to a novel kind of structural sparsification, the new algorithm maintains the time and space complexity of the algorithm for fixed input structures. Furthermore, we generalize the chaining algorithm of ExpaRNA in order to compute a compatible subset of ExpaRNA-P's exact matchings. We show that ExpaRNA-P outperforms ExpaRNA in BRAliBase 2.1 benchmarks, where we pass the chained exact matchings as anchor constraints to the RNA alignment tool LocARNA. Compared to LocARNA, this novel approach shows similar accuracy but is six times faster.
    Proceedings of the 16th International Conference on Research in Computational Molecular Biology (RECOMB 2012); 01/2012
