ABSTRACT: Recent advances in next-generation sequencing technology have revealed a large number of un-annotated RNA transcripts. Comparative study of the RNA structurome is an important approach to assessing their biological functions. Because these transcripts are large and abundant, an efficient and accurate RNA structure-structure alignment algorithm is urgently needed to facilitate such comparative studies. Despite the importance of the RNA secondary structure alignment problem, no available computational tool provides both high computational efficiency and high accuracy. Designing and implementing such an algorithm is therefore highly desirable.
In this work, by incorporating the sparse dynamic programming technique, we implemented an algorithm with an O(n^3) expected time complexity, where n is the average number of base pairs in the RNA structures. This complexity, which can be shown assuming the polymer-zeta property, is confirmed by our experiments. The resulting RNA secondary structure alignment tool is called ERA. Benchmark results indicate that ERA can significantly speed up RNA structure-structure alignments compared to other state-of-the-art RNA alignment tools, while maintaining high alignment accuracy.
Using the sparse dynamic programming technique, we are able to develop a new RNA secondary structure alignment tool that is both efficient and accurate. We anticipate that the new alignment algorithm ERA will significantly promote comparative RNA structure studies. The program, ERA, is freely available at http://genome.ucf.edu/ERA.
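One common way to check an expected time complexity such as O(n^3) against experiments is to fit a line to log(running time) versus log(n) and read the slope as the empirical exponent. The sketch below illustrates only that generic check; it is not taken from ERA, the function name `fit_exponent` is hypothetical, and the timings are synthetic values generated from an exact cubic law rather than measurements.

```python
import math

def fit_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size).

    For running times that grow like c * n^k, the slope estimates k.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic (not measured) timings that follow t = c * n^3 exactly.
sizes = [50, 100, 200, 400]
times = [2e-7 * n ** 3 for n in sizes]

slope = fit_exponent(sizes, times)
print(f"estimated exponent: {slope:.3f}")
```

With real measurements the slope would only approximate the exponent, and small inputs dominated by constant overhead are usually excluded from the fit.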
ABSTRACT: Metastatic melanoma is a malignant cancer with a generally poor prognosis and no targeted chemotherapy. To identify epigenetic changes related to melanoma, we have determined genome-wide methylated CpG island distributions by next-generation sequencing. Melanoma chromosomes tend to be differentially methylated over short CpG island tracts. CpG islands in the upstream regulatory regions of many coding and noncoding RNA genes, including, for example, TERC, which encodes the telomerase RNA, exhibit extensive hypermethylation, whereas several repeated elements, such as LINE 2, and several LTR elements, are hypomethylated in advanced stage melanoma cell lines. By using CpG island demethylation profiles, and by integrating these data with RNA-seq data obtained from melanoma cells, we have identified a co-expression network of differentially methylated genes with significance for cancer-related functions. Focused assays of melanoma patient tissue samples for CpG island methylation near the noncoding RNA gene SNORD-10 demonstrated high specificity.
ABSTRACT: RNA structural motifs are the building blocks of the complex RNA architecture. Identification of non-coding RNA structural motifs is a critical step towards understanding their structures and functions. In this article, we present a clustering approach for de novo RNA structural motif identification. We applied our approach to a data set containing 5S, 16S and 23S rRNAs and rediscovered many known motifs including GNRA tetraloop, kink-turn, C-loop, sarcin-ricin, reverse kink-turn, hook-turn, E-loop and tandem-sheared motifs, with higher accuracy than the state-of-the-art clustering method. We also identified a number of potential novel instances of GNRA tetraloop, kink-turn, sarcin-ricin and tandem-sheared motifs. More importantly, several novel structural motif families have been revealed by our clustering analysis. We identified a highly asymmetric bulge loop motif that resembles a rope sling. We also found an internal loop motif that can significantly increase the twist of the helix. Finally, we discovered a subfamily of the hexaloop motif, which has a significantly different geometry compared to the currently known hexaloop motif. The discoveries presented in this article substantially extend current knowledge of RNA structural motifs.
Nucleic Acids Research 02/2012; 40(3):1307-17. · 8.28 Impact Factor
ABSTRACT: The non-coding RNA (ncRNA) elements in the 3′ untranslated regions (3′-UTRs) are known to participate in the post-transcriptional regulation of genes, affecting their stability, translation efficiency, and subcellular localization. Inferring co-expression patterns of the genes by clustering their 3′-UTR ncRNA elements will provide valuable knowledge for further studies of their functions and interactions under specific physiological processes. In this work, we propose an improved RNA structural clustering pipeline that takes into account the length-dependent distribution of the structural similarity measure. Benchmarking the proposed pipeline on Rfam data demonstrates a performance gain of over 10% compared to a traditional hierarchical clustering pipeline. By applying the proposed clustering pipeline to Drosophila melanogaster's 3′-UTRs, we have identified 184 ncRNA clusters, of which 91.3% appear to be true RNA structural elements, based on RNAz's prediction. Among the clusters we have rediscovered the well-known histone ncRNA family as well as a number of other families whose potential functions may be inferred from existing studies. One such family contains genes that are preferentially expressed in male Drosophila. In situ hybridization further reveals their characteristic ‘cup’ or ‘comet’ localization patterns in Drosophila testis. The complete clustering results are available at http://genome.ucf.edu/fly3UTRcluster.
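To illustrate what accounting for a length-dependent score distribution can mean in practice, the sketch below bins a background set of similarity scores by structure length and converts a raw score into a z-score relative to its own length bin. This is an assumed, simplified normalization for illustration, not the pipeline described in the abstract; the function names, the bin width, and the background values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

def fit_length_bins(background, bin_width=10):
    """Estimate (mean, sample std) of scores per length bin.

    background: iterable of (length, score) pairs (hypothetical data).
    Bins with fewer than two scores are skipped because stdev needs two.
    """
    bins = defaultdict(list)
    for length, score in background:
        bins[length // bin_width].append(score)
    return {b: (mean(s), stdev(s)) for b, s in bins.items() if len(s) >= 2}

def length_adjusted_z(score, length, params, bin_width=10):
    """Z-score of a raw score relative to the background for its length bin."""
    mu, sigma = params[length // bin_width]
    if sigma == 0:
        raise ValueError("background scores in this bin have zero spread")
    return (score - mu) / sigma

# Made-up background: longer structures tend to score higher.
background = [(12, 1), (15, 2), (18, 3), (22, 10), (25, 12), (28, 14)]
params = fit_length_bins(background)

z_short = length_adjusted_z(4, 15, params)  # raw score 4 at length 15
z_long = length_adjusted_z(4, 25, params)   # same raw score at length 25
print(z_short, z_long)
```

The same raw score is well above the background for short structures and well below it for long ones, which is why a length-aware measure can change which elements end up clustered together.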
ABSTRACT: Invasive melanoma is the most lethal form of skin cancer. The treatment of melanoma-derived cell lines with 5-aza-2'-deoxycytidine (5-Aza-dC) markedly increases the expression of several miRNAs, suggesting that the miRNA-encoding genes might be epigenetically regulated, either directly or indirectly, by DNA methylation. We have identified a group of epigenetically regulated miRNA genes in melanoma cells, and have confirmed that the upstream CpG island sequences of several such miRNA genes are hypermethylated in cell lines derived from different stages of melanoma, but not in melanocytes and keratinocytes. We used direct DNA bisulfite and immunoprecipitated DNA (Methyl-DIP) to identify changes in CpG island methylation in distinct melanoma patient samples classified as primary in situ, regional metastatic, and distant metastatic. Two melanoma cell lines (WM1552C and A375, derived from stage 3 and stage 4 human melanoma, respectively) were engineered to ectopically express one of the epigenetically modified miRNAs: miR-34b. Expression of miR-34b reduced cell invasion and motility rates of both WM1552C and A375, suggesting that the enhanced cell invasiveness and motility observed in metastatic melanoma cells may be related to their reduced expression of miR-34b. Total RNA isolated from control or miR-34b-expressing WM1552C cells was subjected to deep sequencing to identify gene networks around miR-34b. We identified network modules that are potentially regulated by miR-34b, and which suggest a mechanism for the role of miR-34b in regulating normal cell motility and cytokinesis.
PLoS ONE 01/2011; 6(9):e24922. · 3.73 Impact Factor
ABSTRACT: Recent studies have shown that RNA structural motifs play essential roles in RNA folding and interaction with other molecules. Computational identification and analysis of RNA structural motifs remains a challenging task. Existing motif identification methods based on 3D structure may not properly compare motifs with high structural variations. Other structural motif identification methods consider only nested canonical base-pairing structures and cannot be used to identify complex RNA structural motifs that often consist of various non-canonical base pairs due to uncommon hydrogen bond interactions. In this article, we present a novel RNA structural alignment method for RNA structural motif identification, RNAMotifScan, which takes into consideration the isosteric (both canonical and non-canonical) base pairs and multi-pairings in RNA structural motifs. The utility and accuracy of RNAMotifScan are demonstrated by searching for kink-turn, C-loop, sarcin-ricin, reverse kink-turn and E-loop motifs against a 23S rRNA (PDBid: 1S72), which is well characterized for the occurrences of these motifs. Finally, we search for these motifs in the RNA structures of the entire Protein Data Bank and estimate their abundances. RNAMotifScan is freely available at our supplementary website (http://genome.ucf.edu/RNAMotifScan).
Nucleic Acids Research 10/2010; 38(18):e176. · 8.28 Impact Factor