ABSTRACT: Protein kinase RNA-activated (PKR) has long been known to be activated by viral double-stranded RNA (dsRNA) as part of the mammalian immune response. However, in mice PKR is also activated by metabolic stress in the absence of viral infection, and this requires a functional kinase domain, as well as a functional dsRNA-binding domain. The endogenous cellular RNA that potentially leads to PKR activation during metabolic stress is unknown. We investigated this question using mouse embryonic fibroblast cells expressing wild-type PKR (PKRWT) or PKR with a point mutation in each dsRNA-binding motif (PKRRM). Using this system, we identified endogenous RNA that interacts with PKR after induction of metabolic stress by palmitic acid (PA) treatment. Specifically, RIP-Seq analyses showed that the majority of enriched RNAs that interacted with WT PKR (≥twofold, false discovery rate ≤ 5%) were small nucleolar RNAs (snoRNAs). Immunoprecipitation of PKR in extracts of UV-cross-linked cells, followed by RT-qPCR, confirmed that snoRNAs were enriched in PKRWT samples after PA treatment, but not in the PKRRM samples. We also demonstrated that a subset of identified snoRNAs bind and activate PKR in vitro; the presence of a 5'-triphosphate enhanced PKR activity compared with the activity with a 5'-monophosphate, for some, but not all, snoRNAs. Finally, we demonstrated PKR activation in cells upon snoRNA transfection, supporting our hypothesis that endogenous snoRNAs can activate PKR. Our results suggest an unprecedented and unexpected model whereby snoRNAs play a role in the activation of PKR under metabolic stress.
Proceedings of the National Academy of Sciences 04/2015; 112(16):201424044. DOI:10.1073/pnas.1424044112 · 9.67 Impact Factor
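The enrichment criterion quoted in the abstract above (at least twofold enrichment at a false discovery rate of 5% or less) can be illustrated with a minimal sketch. The record type, the function name, and every value below are hypothetical placeholders invented for illustration; none of them are data or code from the study.

```python
from dataclasses import dataclass


@dataclass
class RnaHit:
    """One RNA from a RIP-Seq comparison (hypothetical record layout)."""
    name: str
    fold_enrichment: float  # immunoprecipitate signal divided by control signal
    fdr: float              # false discovery rate as a fraction (0.05 == 5%)


def enriched(hits, min_fold=2.0, max_fdr=0.05):
    """Keep hits that meet both thresholds; the bounds are inclusive."""
    return [h for h in hits if h.fold_enrichment >= min_fold and h.fdr <= max_fdr]


# Placeholder values chosen only to exercise both thresholds.
hits = [
    RnaHit("rna_a", 3.1, 0.01),   # passes both thresholds
    RnaHit("rna_b", 1.4, 0.001),  # fails the fold-enrichment threshold
    RnaHit("rna_c", 2.0, 0.05),   # passes exactly at both bounds
    RnaHit("rna_d", 5.0, 0.20),   # fails the FDR threshold
]
selected = [h.name for h in enriched(hits)]
print(selected)
```

Both conditions are applied jointly, so a strongly enriched RNA with a poor FDR is excluded just as a weakly enriched one is.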
ABSTRACT: Early vertebrate embryos must achieve totipotency and prepare for zygotic genome activation (ZGA). To understand this process, we determined the DNA methylation (DNAme) profiles of zebrafish gametes, embryos at different stages, and somatic muscle and compared them to gene activity and histone modifications. Sperm chromatin patterns are virtually identical to those at ZGA. Unexpectedly, the DNA of many oocyte genes important for germline functions (e.g., piwil1) or early development (e.g., hox genes) is methylated, but the loci are demethylated during zygotic cleavage stages to precisely the state observed in sperm, even in parthenogenetic embryos lacking a replicating paternal genome. Furthermore, this cohort constitutes the genes and loci that acquire DNAme during development (i.e., from ZGA to muscle). Finally, DNA methyltransferase inhibition experiments suggest that DNAme silences particular gene and chromatin cohorts at ZGA, preventing their precocious expression. Thus, zebrafish achieve a totipotent chromatin state at ZGA through paternal genome competency and maternal genome DNAme reprogramming.
ABSTRACT: The sperm chromatin of fertile men retains a small number of nucleosomes that are enriched at developmental gene promoters and imprinted gene loci. This unique chromatin packaging at certain gene promoters provides these genomic loci the ability to convey instructive epigenetic information to the zygote, potentially expanding the role and significance of the sperm epigenome in embryogenesis. We hypothesize that changes in chromatin packaging may be associated with poor reproductive outcome.
Seven patients with reproductive dysfunction were recruited: three had unexplained poor embryogenesis during IVF and four were diagnosed with male infertility and previously shown to have altered protamination. The genome-wide location of histones and histone modifications was analyzed by isolation and purification of DNA bound to histones and protamines. The histone-bound fraction of DNA was analyzed using high-throughput sequencing, both initially and following chromatin immunoprecipitation. The protamine-bound fraction was hybridized to Agilent arrays. DNA methylation was examined using bisulfite sequencing.
Unlike fertile men, five of seven infertile men had non-programmatic (randomly distributed) histone retention genome-wide. Interestingly, in contrast to the total histone pool, the localization of H3 lysine 4 methylation (H3K4me) or H3 lysine 27 methylation (H3K27me) was highly similar in the gametes of infertile men compared with fertile men. However, there was a reduction in the amount of H3K4me or H3K27me retained at developmental transcription factors and certain imprinted genes. Finally, the methylation status of candidate developmental promoters and imprinted loci was altered in a subset of the infertile men.
This initial genome-wide analysis of epigenetic markings in the sperm of infertile men demonstrates differences in composition and epigenetic markings compared with fertile men, especially at certain imprinted and developmental loci. Although no single locus displays a complete change in chromatin packaging or DNA modification, the data suggest that moderate changes throughout the genome exist and may have a cumulative detrimental effect on fecundity.
Human Reproduction 06/2011; 26(9):2558-69. DOI:10.1093/humrep/der192 · 4.57 Impact Factor
ABSTRACT: Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the "Introduction."
ABSTRACT: Microarray studies of chronic hepatitis C infection have provided valuable information regarding the host response to viral infection. However, recent studies of the human transcriptome indicate pervasive transcription in previously unannotated regions of the genome and that many RNA transcripts have short 3' poly(A) ends or lack them altogether. We hypothesized that using ENCODE tiling arrays (1% of the genome) in combination with affinity purification of Pol II RNAs by their unique 5' m⁷GpppN cap would identify previously undescribed annotated and unannotated genes that are differentially expressed in liver during hepatitis C virus (HCV) infection. Both 5'-capped and poly(A)+ populations of RNA were analyzed using ENCODE tiling arrays. Sixty-four annotated genes were significantly increased in HCV cirrhotic as compared to control liver; twenty-seven (42%) of these genes were identified only by analyzing 5' capped RNA. Thirty-one annotated genes were significantly decreased; sixteen (50%) of these were identified only by analyzing 5' capped RNA. Bioinformatic analysis showed that capped RNA produced more consistent results, provided a more extensive expression profile of intronic regions and identified upregulated Pol II transcriptionally active regions in unannotated areas of the genome in HCV cirrhotic liver. Two of these regions were verified by PCR and RACE analysis. qPCR analysis of liver biopsy specimens demonstrated that these unannotated transcripts, as well as IRF1, TRIM22 and MET, were also upregulated in hepatitis C with mild inflammation and no fibrosis. The analysis of 5' capped RNA in combination with ENCODE tiling arrays provides additional gene expression information and identifies novel upregulated Pol II transcripts not previously described in HCV infected liver.
This approach, particularly when combined with new RNA sequencing technologies, should also be useful in further defining Pol II transcripts differentially regulated in specific disease states and in studying RNAs regulated by changes in pre-mRNA splicing or 3' polyadenylation status.
PLoS ONE 02/2011; 6(2):e14697. DOI:10.1371/journal.pone.0014697 · 3.23 Impact Factor
ABSTRACT: DNA methylation on cytosine in vertebrates such as zebrafish serves to silence gene expression by interfering with the binding of certain transcription factors and through the recruitment of repressive chromatin machinery. Cytosine DNA methylation is chemically stable and heritable through the germline, but also reversible through many modes, making it a useful and dynamic epigenetic modification. Virtually all of the enzymes and factors involved in the deposition, binding, and removal of cytosine methylation are conserved in zebrafish, and therefore the organism is an excellent model for understanding the use of DNA methylation in the control of gene regulation and other processes. Here, we discuss the main approaches to quantifying DNA methylation levels genome-wide in zebrafish: one is an established method for revealing regional methylation (methylated DNA immunoprecipitation (MeDIP)), and the other is an emerging method that reveals DNA methylation at base-pair resolution (shotgun bisulphite sequencing). We also introduce some of the analytical methods that are useful for identifying regions of hypo- or hyper-methylation, and ways to identify differentially methylated regions.
ABSTRACT: With the rapidly falling cost and growing availability of high throughput sequencing and microarray technologies, the bottleneck for effectively using genomic analysis in the laboratory and clinic is shifting to one of effectively managing, analyzing, and sharing genomic data.
Here we present three open-source, platform independent, software tools for generating, analyzing, distributing, and visualizing genomic data. These include a next generation sequencing/microarray LIMS and analysis project center (GNomEx); an application for annotating and programmatically distributing genomic data using the community vetted DAS/2 data exchange protocol (GenoPub); and a standalone Java Swing application (GWrap) that makes cutting edge command line analysis tools available to those who prefer graphical user interfaces. Both GNomEx and GenoPub use the rich client Flex/Flash web browser interface to interact with Java classes and a relational database on a remote server. Both employ a public-private user-group security model enabling controlled distribution of patient and unpublished data alongside public resources. As such, they function as genomic data repositories that can be accessed manually or programmatically through DAS/2-enabled client applications such as the Integrated Genome Browser.
These tools have gained wide use in our core facilities, research laboratories and clinics and are freely available for non-profit use. See http://sourceforge.net/projects/gnomex/, http://sourceforge.net/projects/genoviz/, and http://sourceforge.net/projects/useq.
ABSTRACT: Years after the discovery that Dicer is a key enzyme in gene silencing, the role of its helicase domain remains enigmatic. Here we show that this domain is critical for accumulation of certain endogenous small interfering RNAs (endo-siRNAs) in Caenorhabditis elegans. The domain is required for the production of the direct products of Dicer, or primary endo-siRNAs, and consequently affects levels of downstream intermediates, the secondary endo-siRNAs. Consistent with the role of endo-siRNAs in silencing, their loss correlates with an increase in cognate mRNA levels. We find that the helicase domain of Dicer is not necessary for microRNA (miRNA) processing, or RNA interference following exposure to exogenous double-stranded RNA. Comparisons of wild-type and helicase-defective strains using deep-sequencing analyses show that the helicase domain is required by a subset of annotated endo-siRNAs, in particular, those associated with the slightly longer 26-nucleotide small RNA species containing a 5' guanosine.
ABSTRACT: Because nucleosomes are widely replaced by protamine in mature human sperm, the epigenetic contributions of sperm chromatin to embryo development have been considered highly limited. Here we show that the retained nucleosomes are significantly enriched at loci of developmental importance, including imprinted gene clusters, microRNA clusters, HOX gene clusters, and the promoters of stand-alone developmental transcription and signalling factors. Notably, histone modifications localize to particular developmental loci. Dimethylated lysine 4 on histone H3 (H3K4me2) is enriched at certain developmental promoters, whereas large blocks of H3K4me3 localize to a subset of developmental promoters, regions in HOX clusters, certain noncoding RNAs, and generally to paternally expressed imprinted loci, but not paternally repressed loci. Notably, trimethylated H3K27 (H3K27me3) is significantly enriched at developmental promoters that are repressed in early embryos, including many bivalent (H3K4me3/H3K27me3) promoters in embryonic stem cells. Furthermore, developmental promoters are generally DNA hypomethylated in sperm, but acquire methylation during differentiation. Taken together, epigenetic marking in sperm is extensive, and correlated with developmental regulators.
ABSTRACT: High throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin immunoprecipitation is used to enrich for particular DNA sequences of interest and signature sequencing is used to map the regions to the genome (ChIP-Seq). Elucidation of these sites of DNA-protein binding/modification is proving instrumental in reconstructing networks of gene regulation and chromatin remodelling that direct development, response to cellular perturbation, and neoplastic transformation.
Here we present a package of algorithms and software that makes use of control input data to reduce false positives and estimate confidence in ChIP-Seq peaks. Several different methods were compared using two simulated spike-in datasets. Use of control input data and a normalized difference score were found to more than double the recovery of ChIP-Seq peaks at a 5% false discovery rate (FDR). Moreover, both a binomial p-value/q-value and an empirical FDR were found to predict the true FDR within 2-3 fold and are more reliable estimators of confidence than a global Poisson p-value. These methods were then used to reanalyze Johnson et al.'s neuron-restrictive silencer factor (NRSF) ChIP-Seq data without relying on extensive qPCR validated NRSF sites and the presence of NRSF binding motifs for setting thresholds.
The methods developed and tested here show considerable promise for reducing false positives and estimating confidence in ChIP-Seq data without any prior knowledge of the ChIP target. They are part of a larger open source package freely available from http://useq.sourceforge.net/.
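The three quantities named in the abstract above (a normalized difference score, a binomial p-value, and a q-value) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the score is assumed to take the form (t - c) / sqrt(t + c) for treatment and control read counts in a window, the binomial null assumes equal sequencing depth so that a read falls in either sample with probability 0.5, and q-values are assumed to come from the Benjamini-Hochberg procedure. All function names are hypothetical.

```python
import math


def normalized_difference(t, c):
    """Assumed score form: (t - c) / sqrt(t + c); returns 0.0 for an empty window."""
    total = t + c
    return 0.0 if total == 0 else (t - c) / math.sqrt(total)


def binomial_upper_tail(t, c, p=0.5):
    """P(X >= t) for X ~ Binomial(t + c, p): chance of seeing at least t
    treatment reads if reads fell in treatment vs. control with probability p."""
    n = t + c
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t, n + 1))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical (treatment, control) read counts for three windows.
windows = [(30, 6), (12, 10), (25, 5)]
scores = [normalized_difference(t, c) for t, c in windows]
pvals = [binomial_upper_tail(t, c) for t, c in windows]
qvals = benjamini_hochberg(pvals)
```

Windows would then be called as peaks when their q-value falls at or below the chosen cutoff, for example 0.05 for a 5% FDR. Real data would also need the control counts scaled to the treatment library size before scoring, which this sketch omits.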
ABSTRACT: We have determined the high-resolution strand-specific transcriptome of the fission yeast S. pombe under multiple growth conditions using a novel RNA-DNA hybridization mapping (HybMap) technique. HybMap uses an antibody against an RNA-DNA hybrid to detect RNA molecules hybridized to a high-density DNA oligonucleotide tiling microarray. HybMap showed exceptional dynamic range and reproducibility, and allowed us to identify strand-specific coding, noncoding and structural RNAs, as well as previously unknown RNAs conserved in distant yeast species. Notably, we found that virtually the entire euchromatic genome (including intergenics) is transcribed, with heterochromatin dampening intergenic transcription. We identified features including large numbers of condition-specific noncoding RNAs, extensive antisense transcription, new properties of antisense transcripts and induced divergent transcription. Furthermore, our HybMap data informed the efficiency and locations of RNA splicing genome-wide. Finally, we observed strand-specific transcription islands around tRNAs at heterochromatin boundaries inside centromeres. Here, we discuss these new features in terms of organism fitness and transcriptome evolution.
ABSTRACT: The most widely used method for detecting genome-wide protein-DNA interactions is chromatin immunoprecipitation on tiling microarrays, commonly known as ChIP-chip. Here, we conducted the first objective analysis of tiling array platforms, amplification procedures, and signal detection algorithms in a simulated ChIP-chip experiment. Mixtures of human genomic DNA and "spike-ins" comprised of nearly 100 human sequences at various concentrations were hybridized to four tiling array platforms by eight independent groups. Blind to the number of spike-ins, their locations, and the range of concentrations, each group made predictions of the spike-in locations. We found that microarray platform choice is not the primary determinant of overall performance. In fact, variation in performance between labs, protocols, and algorithms within the same array platform was greater than the variation in performance between array platforms. However, each array platform had unique performance characteristics that varied with tiling resolution and the number of replicates, which have implications for cost versus detection power. Long oligonucleotide arrays were slightly more sensitive at detecting very low enrichment. On all platforms, simple sequence repeats and genome redundancy tended to result in false positives. LM-PCR and WGA, the most popular sample amplification techniques, reproduced relative enrichment levels with high fidelity. Performance among signal detection algorithms was heavily dependent on array platform. The spike-in DNA samples and the data presented here provide a stable benchmark against which future ChIP platforms, protocol improvements, and analysis methods can be evaluated.
Genome Research 04/2008; 18(3):393-403. DOI:10.1101/gr.7080508 · 14.63 Impact Factor
ABSTRACT: Identifying the genomic regions bound by sequence-specific regulatory factors is central both to deciphering the complex DNA cis-regulatory code that controls transcription in metazoans and to determining the range of genes that shape animal morphogenesis. We used whole-genome tiling arrays to map sequences bound in Drosophila melanogaster embryos by the six maternal and gap transcription factors that initiate anterior-posterior patterning. We find that these sequence-specific DNA binding proteins bind with quantitatively different specificities to highly overlapping sets of several thousand genomic regions in blastoderm embryos. Specific high- and moderate-affinity in vitro recognition sequences for each factor are enriched in bound regions. This enrichment, however, is not sufficient to explain the pattern of binding in vivo and varies in a context-dependent manner, demonstrating that higher-order rules must govern targeting of transcription factors. The more highly bound regions include all of the over 40 well-characterized enhancers known to respond to these factors as well as several hundred putative new cis-regulatory modules clustered near developmental regulators and other genes with patterned expression at this stage of embryogenesis. The new targets include most of the microRNAs (miRNAs) transcribed in the blastoderm, as well as all major zygotically transcribed dorsal-ventral patterning genes, whose expression we show to be quantitatively modulated by anterior-posterior factors. In addition to these highly bound regions, there are several thousand regions that are reproducibly bound at lower levels. However, these poorly bound regions are, collectively, far more distant from genes transcribed in the blastoderm than highly bound regions; are preferentially found in protein-coding sequences; and are less conserved than highly bound regions.
Together these observations suggest that many of these poorly bound regions are not involved in early-embryonic transcriptional regulation, and a significant proportion may be nonfunctional. Surprisingly, for five of the six factors, their recognition sites are not unambiguously more constrained evolutionarily than the immediate flanking DNA, even in more highly bound and presumably functional regions, indicating that comparative DNA sequence analysis is limited in its ability to identify functional transcription factor targets.
ABSTRACT: Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated. Taken together, these observations suggest a novel role for some unannotated RNAs as primary transcripts for the production of short RNAs. Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes. These data support a highly interleaved organization of the human transcriptome.