Article

Alignment of RNA base pairing probability matrices.

Bioinformatics 01/2004; 20:2222-2227.
Source: DBLP
  • Source
    ABSTRACT: Accurate comparative analysis of low-homology proteins remains a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm’s complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Testing against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise sequence alignment tools in the most difficult low-sequence-homology case and improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families. partiFold-Align is available at http://partiFold.csail.mit.edu.
    Srinivas Devadas. 01/2009;
  • Source
    ABSTRACT: Sequence-structure alignment of RNA with arbitrary secondary structure is Max-SNP-hard. Therefore, the problem of RNA alignment is commonly restricted to nested structures, where dynamic programming yields efficient solutions. However, nested structures cannot model pseudoknots or even more complex structural dependencies. Nevertheless, those dependencies are essential and conserved features of many RNAs. Only a few existing approaches deal with crossing structures. Here, we present a constraint approach for alignment of structures in the even more general class of unlimited structures. Our central contribution is a new RNA alignment constraint propagator. It is based on an efficient O(n^2) relaxation of the RNA alignment problem. Our constraint-based approach Carna solves the alignment problem for sequences with given input structures of unlimited complexity. Carna is implemented using Gecode. In the post-genomic era, biologists are increasingly interested in studying non-coding RNA molecules with catalytic and regulatory activity as central players in biological systems. The computational analysis of non-coding RNA requires taking structural information into account. Whereas RNAs form three-dimensional structures, structural analysis of RNA is usually concerned with the secondary structure of an RNA, i.e. the set of RNA base pairs (i, j) that form contacts (H-bonds) between the bases i and j. The RNA alignment problem is to align two RNA sequences A and B with given secondary structure for each RNA such that a score based on sequence and structure similarity is optimized. The difficulty of this problem depends on the complexity of the RNA structures. Therefore, a complexity hierarchy of RNA structures was introduced.
    Most RNA analysis is performed for the class of nested structures P, where base pairs do not cross, because for this class one can find efficient dynamic programming algorithms for structure prediction and alignment under reasonable scoring schemes [13, 6]. The more general class of crossing RNA structures P restricts the degree of base pairing to at most one, as is commonly assumed for a single RNA structure. Prediction and alignment in this class is NP-hard in general [2]. However, one can devise a number of algorithms that efficiently predict or align RNAs with structures from classes in between non-crossing and arbitrary crossing [10, 9, 8]. However, these algorithms have complexities that limit their application range. Other approaches for RNA alignment handle crossing structures with parametrized complexity, where the parameter captures the complexity of the structures [7]. Finally, the ILP approach Lara [1] computes alignments of arbitrarily complex crossing structures and appears to be more effective than dynamic-programming-based approaches. The success of this AI technique was a strong motivation for this work, where we study the alignment of RNAs with structures of unlimited complexity using constraint programming.
    01/2010;
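    The structure classes named in the abstract above (nested, crossing, unlimited) can be illustrated with a minimal Python sketch. This is not code from Carna or any cited tool: the function names and the string labels are hypothetical, and the sketch assumes a structure is given as a list of base pairs (i, j) over integer positions, with "unlimited" taken to mean that some base takes part in more than one pair.

```python
from typing import Dict, Iterable, List, Tuple

BasePair = Tuple[int, int]  # (i, j): positions of two paired bases


def max_degree(pairs: Iterable[BasePair]) -> int:
    """Return the largest number of base pairs any single position takes part in."""
    counts: Dict[int, int] = {}
    for i, j in pairs:
        counts[i] = counts.get(i, 0) + 1
        counts[j] = counts.get(j, 0) + 1
    return max(counts.values(), default=0)


def has_crossing(pairs: Iterable[BasePair]) -> bool:
    """Return True if two pairs (i, j) and (k, l) satisfy i < k < j < l."""
    norm: List[BasePair] = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a in range(len(norm)):
        i, j = norm[a]
        for b in range(a + 1, len(norm)):
            k, l = norm[b]
            if i < k < j < l:
                return True
    return False


def classify(pairs: Iterable[BasePair]) -> str:
    """Label a structure as 'nested', 'crossing', or 'unlimited' (hypothetical labels)."""
    pairs = list(pairs)
    if max_degree(pairs) > 1:
        # Assumption: a base with more than one partner falls outside the crossing class.
        return "unlimited"
    return "crossing" if has_crossing(pairs) else "nested"
```

    For example, under these assumptions `classify([(1, 10), (2, 9)])` gives "nested", while `classify([(1, 5), (3, 8)])` gives "crossing" because the two pairs interleave, as in a pseudoknot. The pairwise crossing check is quadratic in the number of base pairs, which is adequate for illustration only.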
  • Source
    ABSTRACT: The present investigation includes in silico sequence analysis, three-dimensional (3D) structure prediction and the evolutionary profile of growth hormone (GH) from 14 ornamental freshwater fishes. The analyses were performed using the sequence data of the growth hormone gene (gh) and its encoded GH protein. The evolutionary analyses were performed using maximum likelihood (ML) and maximum parsimony (MP) methods. A bootstrap test (1000 replicates) was performed to validate the phylogenetic tree. The tertiary structures of GH were predicted using the comparative modelling method. A suitable template for comparative modelling from the Protein Data Bank (PDB ID: 1HWG A) was selected on the basis of basic local alignment search tool (BLASTp) and fast analysis (FASTA) results. The target-template alignment, model building, loop modelling and evaluation were performed in Modeller 9.10. The tertiary structure of GH consists of α-helices connected by loops, forming a compact complex maintained by two disulfide bridges. The resultant 3D models were verified with the ERRAT and ProCheck programmes. After successful verification, the tertiary structures of GH were deposited in the protein model database (PMDB). Sequence analyses and RNA secondary structure prediction were performed with CLC genomics workbench version 4.0. The computational models of GH could be of use for further evaluation of the molecular mechanism of function.
    African Journal of Biotechnology 04/2012; Vol. 11(31):8005-8021. · 0.57 Impact Factor