ABSTRACT: The post-transcriptional fate of messenger RNAs (mRNAs) is largely dictated by their 3' untranslated regions (3' UTRs), which are defined by cleavage and polyadenylation (CPA) of pre-mRNAs. We used poly(A)-position profiling by sequencing (3P-seq) to map poly(A) sites at eight developmental stages and tissues in the zebrafish. Analysis of over 60 million 3P-seq reads substantially increased and improved existing 3' UTR annotations, resulting in confidently identified 3' UTRs for >79% of the annotated protein-coding genes in zebrafish. mRNAs from most zebrafish genes undergo alternative CPA, with those from more than a thousand genes using different dominant 3' UTRs at different stages. These included one of the poly(A) polymerase genes, for which alternative CPA reinforces its repression in the ovary. 3' UTRs tend to be shortest in the ovaries and longest in the brain. Isoforms with some of the shortest 3' UTRs are highly expressed in the ovary, yet absent in the maternally contributed RNAs of the embryo, perhaps because their 3' UTRs are too short to accommodate a uridine-rich motif required for stability of the maternal mRNA. At 2 h post-fertilization, thousands of unique poly(A) sites appear at locations lacking a typical polyadenylation signal, which suggests a wave of widespread cytoplasmic polyadenylation of mRNA degradation intermediates. Our insights into the identities, formation, and evolution of zebrafish 3' UTRs provide a resource for studying gene regulation during vertebrate development.
Genome Research 06/2012; 22(10):2054-66. · 14.40 Impact Factor
ABSTRACT: Thousands of long intervening noncoding RNAs (lincRNAs) have been identified in mammals. To better understand the evolution and functions of these enigmatic RNAs, we used chromatin marks, poly(A)-site mapping and RNA-Seq data to identify more than 550 distinct lincRNAs in zebrafish. Although these shared many characteristics with mammalian lincRNAs, only 29 had detectable sequence similarity with putative mammalian orthologs, typically restricted to a single short region of high conservation. Other lincRNAs had conserved genomic locations without detectable sequence conservation. Antisense reagents targeting conserved regions of two zebrafish lincRNAs caused developmental defects. Reagents targeting splice sites caused the same defects and were rescued by adding either the mature lincRNA or its human or mouse ortholog. Our study provides a roadmap for identification and analysis of lincRNAs in model organisms and shows that lincRNAs play crucial biological roles during embryonic development with functionality conserved despite limited sequence conservation.
ABSTRACT: Post-transcriptional gene regulation frequently occurs through elements in mRNA 3' untranslated regions (UTRs). Although crucial roles for 3'UTR-mediated gene regulation have been found in Caenorhabditis elegans, most C. elegans genes have lacked annotated 3'UTRs. Here we describe a high-throughput method for reliable identification of polyadenylated RNA termini, and we apply this method, called poly(A)-position profiling by sequencing (3P-Seq), to determine C. elegans 3'UTRs. Compared to standard methods also recently applied to C. elegans UTRs, 3P-Seq identified 8,580 additional UTRs while excluding thousands of shorter UTR isoforms that do not seem to be authentic. Analysis of this expanded and corrected data set suggested that the high A/U content of C. elegans 3'UTRs facilitated genome compaction, because the elements specifying cleavage and polyadenylation, which are A/U rich, can more readily emerge in A/U-rich regions. Indeed, 30% of the protein-coding genes have mRNAs with alternative, partially overlapping end regions that generate another 10,480 cleavage and polyadenylation sites that had gone largely unnoticed and represent potential evolutionary intermediates of progressive UTR shortening. Moreover, a third of the convergently transcribed genes use palindromic arrangements of bidirectional elements to specify UTRs with convergent overlap, which also contributes to genome compaction by eliminating regions between genes. Although nematode 3'UTRs have median length only one-sixth that of mammalian 3'UTRs, they have twice the density of conserved microRNA sites, in part because additional types of seed-complementary sites are preferentially conserved. These findings reveal the influence of cleavage and polyadenylation on the evolution of genome architecture and provide resources for studying post-transcriptional gene regulation.
ABSTRACT: MicroRNAs (miRNAs) are approximately 22-nucleotide RNAs that are processed from characteristic precursor hairpins and pair to sites in messages of protein-coding genes to direct post-transcriptional repression. Here, we report that the miRNA iab-4 locus in the Drosophila Hox cluster is transcribed convergently from both DNA strands, giving rise to two distinct functional miRNAs. Both sense and antisense miRNA products target neighboring Hox genes via highly conserved sites, leading to homeotic transformations when ectopically expressed. We also report sense/antisense miRNAs in mouse and find antisense transcripts close to many miRNAs in both flies and mammals, suggesting that additional sense/antisense pairs exist.
Genes & Development 02/2008; 22(1):8-13. · 12.44 Impact Factor
ABSTRACT: MicroRNAs (miRNAs) are approximately 22-nucleotide endogenous RNAs that often repress the expression of complementary messenger RNAs. In animals, miRNAs derive from characteristic hairpins in primary transcripts through two sequential RNase III-mediated cleavages; Drosha cleaves near the base of the stem to liberate an approximately 60-nucleotide pre-miRNA hairpin, then Dicer cleaves near the loop to generate a miRNA:miRNA* duplex. From that duplex, the mature miRNA is incorporated into the silencing complex. Here we identify an alternative pathway for miRNA biogenesis, in which certain debranched introns mimic the structural features of pre-miRNAs to enter the miRNA-processing pathway without Drosha-mediated cleavage. We call these pre-miRNAs/introns 'mirtrons', and have identified 14 mirtrons in Drosophila melanogaster and another four in Caenorhabditis elegans (including the reclassification of mir-62). Some of these have been selectively maintained during evolution with patterns of sequence conservation suggesting important regulatory functions in the animal. The abundance of introns comparable in size to pre-miRNAs appears to have created a context favourable for the emergence of mirtrons in flies and nematodes. This suggests that other lineages with many similarly sized introns probably also have mirtrons, and that the mirtron pathway could have provided an early avenue for the emergence of miRNAs before the advent of Drosha.
ABSTRACT: We sequenced approximately 400,000 small RNAs from Caenorhabditis elegans. Another 18 microRNA (miRNA) genes were identified, thereby extending to 112 our tally of confidently identified miRNA genes in C. elegans. Also observed were thousands of endogenous siRNAs generated by RNA-directed RNA polymerases acting preferentially on transcripts associated with spermatogenesis and transposons. In addition, a third class of nematode small RNAs, called 21U-RNAs, was discovered. 21U-RNAs are precisely 21 nucleotides long, begin with a uridine 5'-monophosphate but are diverse in their remaining 20 nucleotides, and appear modified at their 3'-terminal ribose. 21U-RNAs originate from more than 5700 genomic loci dispersed in two broad regions of chromosome IV, primarily between protein-coding genes or within their introns. These loci share a large upstream motif that enables accurate prediction of additional 21U-RNAs. The motif is conserved in other nematodes, presumably because of its importance for producing these diverse, autonomously expressed, small RNAs (dasRNAs).
ABSTRACT: In Arabidopsis, microRNA-directed cleavage can define one end of RNAs that then generate phased siRNAs. However, most miRNA-targeted RNAs do not spawn siRNAs, suggesting the existence of additional determinants within those that do. We find that in moss, phased siRNAs arise from regions flanked by dual miR390 cleavage sites. AtTAS3, an siRNA locus important for development and conserved among higher plants, also has dual miR390 complementary sites. Both sites bind miR390 in vitro and are functionally required in Arabidopsis, but cleavage is undetectable at the 5' site, demonstrating that noncleavable sites can be functional in plants. Phased siRNAs also emanate from the bounded regions of every Arabidopsis gene with two known microRNA/siRNA complementary sites, but only rarely from genes with single sites. Therefore, two "hits" (often, but not always, two cleavage events) constitute a conserved trigger for siRNA biogenesis, a finding with implications for recognition and silencing of aberrant RNA.
ABSTRACT: Thousands of mammalian messenger RNAs are under selective pressure to maintain 7-nucleotide sites matching microRNAs (miRNAs). We found that these conserved targets are often highly expressed at developmental stages before miRNA expression and that their levels tend to fall as the miRNA that targets them begins to accumulate. Nonconserved sites, which outnumber the conserved sites 10 to 1, also mediate repression. As a consequence, genes preferentially expressed at the same time and place as a miRNA have evolved to selectively avoid sites matching the miRNA. This phenomenon of selective avoidance extends to thousands of genes and enables spatial and temporal specificities of miRNAs to be revealed by finding tissues and developmental stages in which messages with corresponding sites are expressed at lower levels.