-
Harukazu Suzuki,
Alistair R R Forrest,
Erik van Nimwegen,
Carsten O Daub,
Piotr J Balwierz,
Katharine M Irvine,
Timo Lassmann,
Timothy Ravasi,
Yuki Hasegawa,
Michiel J L de Hoon, [......],
Noriko Ninomiya,
Hiromi Nishiyori,
Shohei Noma,
Chihiro Ogawa,
Takuma Sano,
Christophe Simon,
Michihira Tagami,
Yukari Takahashi,
Jun Kawai,
Yoshihide Hayashizaki
ABSTRACT: Using deep sequencing (deepCAGE), the FANTOM4 study measured the genome-wide dynamics of transcription-start-site usage in the human monocytic cell line THP-1 throughout a time course of growth arrest and differentiation. Modeling the expression dynamics in terms of predicted cis-regulatory sites, we identified the key transcription regulators, their time-dependent activities and target genes. Systematic siRNA knockdown of 52 transcription factors confirmed the roles of individual factors in the regulatory network. Our results indicate that cellular states are constrained by complex networks involving both positive and negative regulatory interactions among substantial numbers of transcription factors and that no single transcription factor is both necessary and sufficient to drive the differentiation process.
Nature Genetics 05/2009; 41(5):553-62. · 35.53 Impact Factor
-
Shota Nakamura,
Cheng-Song Yang,
Naomi Sakon,
Mayo Ueda,
Takahiro Tougan,
Akifumi Yamashita,
Naohisa Goto,
Kazuo Takahashi,
Teruo Yasunaga,
Kazuyoshi Ikuta, [......],
Yoshiko Okamoto,
Michihira Tagami,
Ryoji Morita, Norihiro Maeda,
Jun Kawai,
Yoshihide Hayashizaki,
Yoshiyuki Nagai,
Toshihiro Horii,
Tetsuya Iida,
Takaaki Nakaya
ABSTRACT: With the severe acute respiratory syndrome epidemic of 2003 and renewed attention on avian influenza viral pandemics, new surveillance systems are needed for the earlier detection of emerging infectious diseases. We applied a "next-generation" parallel sequencing platform for viral detection in nasopharyngeal and fecal samples collected during seasonal influenza virus (Flu) infections and norovirus outbreaks from 2005 to 2007 in Osaka, Japan. Random RT-PCR was performed to amplify RNA extracted from 0.1-0.25 ml of nasopharyngeal aspirates (N = 3) and fecal specimens (N = 5), and more than 10 microg of cDNA was synthesized. Unbiased high-throughput sequencing of these 8 samples yielded 15,298-32,335 (average 24,738) reads in a single 7.5 h run. In nasopharyngeal samples, although whole genome analysis was not available because the majority (>90%) of reads were host genome-derived, 20-460 Flu-reads were detected, which was sufficient for subtype identification. In fecal samples, bacteria and host cells were removed by centrifugation, resulting in gain of 484-15,260 reads of norovirus sequence (78-98% of the whole genome was covered), except for one specimen that was under-detectable by RT-PCR. These results suggest that our unbiased high-throughput sequencing approach is useful for directly detecting pathogenic viruses without advance genetic information. Although its cost and technological availability make it unlikely that this system will very soon be the diagnostic standard worldwide, this system could be useful for the earlier discovery of novel emerging viruses and bioterrorism, which are difficult to detect with conventional procedures.
PLoS ONE 01/2009; 4(1):e4219. · 4.09 Impact Factor
-
Eivind Valen,
Giovanni Pascarella,
Alistair Chalk, Norihiro Maeda,
Miki Kojima,
Chika Kawazu,
Mitsuyoshi Murata,
Hiromi Nishiyori,
Dejan Lazarevic,
Dario Motti, [......],
Ole Winther,
Takahiro Arakawa,
Jun Kawai,
Christine Wells,
Carsten Daub,
Matthias Harbers,
Yoshihide Hayashizaki,
Stefano Gustincich,
Albin Sandelin,
Piero Carninci
ABSTRACT: Finding and characterizing mRNAs, their transcription start sites (TSS), and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5-10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present a new method for high-throughput sequencing of 5' cDNA tags-DeepCAGE: merging the Cap Analysis of Gene Expression method with ultra-high-throughput sequence technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters that are preferentially used in hippocampus: This is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.
Genome Research 01/2009; 19(2):255-65. · 13.61 Impact Factor
-
Shota Nakamura, Norihiro Maeda,
Ionut Mihai Miron,
Myonsun Yoh,
Kaori Izutsu,
Chidoh Kataoka,
Takeshi Honda,
Teruo Yasunaga,
Takaaki Nakaya,
Jun Kawai,
Yoshihide Hayashizaki,
Toshihiro Horii,
Tetsuya Iida
ABSTRACT: To test the ability of high-throughput DNA sequencing to detect bacterial pathogens, we used it on DNA from a patient's feces during and after diarrheal illness. Sequences showing best matches for Campylobacter jejuni were detected only in the illness sample. Various bacteria may be detectable with this metagenomic approach.
Emerging Infectious Diseases 12/2008; 14(11):1784-6. · 6.79 Impact Factor
-
Norihiro Maeda,
Hiromi Nishiyori,
Mari Nakamura,
Chika Kawazu,
Mitsuyoshi Murata,
Hiromi Sano,
Kengo Hayashida,
Shiro Fukuda,
Michihira Tagami,
Akira Hasegawa,
Kayoko Murakami,
Kate Schroder,
Katharine Irvine,
David Hume,
Yoshihide Hayashizaki,
Piero Carninci,
Harukazu Suzuki
ABSTRACT: CAGE (cap analysis of gene expression) is a method for identifying transcription start sites by sequencing the first 20 or 21 nucleotides from the 5' end of capped transcripts, allowing genome-wide promoter analyses to be performed. The potential of the CAGE as a form of expression profiling was limited previously by sequencing technology and the labor-intensive protocol. Here we describe an improved CAGE method for use with a next generation sequencer. This modified method allows the identification of the RNA source of each CAGE tag within a pooled library by introducing DNA tags (barcodes). The method not only drastically improves the sequencing capacity, but also contributes to savings in both time and budget. Additionally, this pooled CAGE tag method enables the dynamic changes in promoter usage and gene expression to be monitored.
BioTechniques 08/2008; 45(1):95-7. · 2.67 Impact Factor
-
ABSTRACT: An assessment of the hybridization characteristics of oligonucleotide tiling arrays was carried out using 162 full-length sequenced cDNA clones in spike-in experiments. The properties of array probes that influence signal intensity were investigated, and their capability in the detection of the cDNA exons was evaluated. The signal intensities detected in exonic and nonexonic genomic regions were examined by focusing on the features of probe sequences that raise or lower the level of intensity and on the causes of false positive signals found in nonexonic regions. The effectiveness of measures used in published protocols to improve the separation between signal and background intensity distributions, including the use of replicates and threshold parameterization of signal intensity, was assessed. Sensitivity and specificity in the detection of exons were measured using various sets of threshold parameters, and the effects of each parameter on the detection efficiency and the rate of false positives were evaluated. It was also demonstrated that hybridization of full-length cDNA clones is an excellent method to investigate the characteristics of oligonucleotide tiling arrays.
Genomics 05/2007; 89(4):541-51. · 3.02 Impact Factor
-
Norihiro Maeda,
Takeya Kasukawa,
Rieko Oyama,
Julian Gough,
Martin Frith,
Pär G Engström,
Boris Lenhard,
Rajith N Aturaliya,
Serge Batalov,
Kirk W Beisel, [......],
Koji Sugiura,
Yoichi Takenaka,
Rohan D Teasdale,
Christine A Wells,
Yunxia Zhu,
Chikatoshi Kai,
Jun Kawai,
David A Hume,
Piero Carninci,
Yoshihide Hayashizaki
ABSTRACT: The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.
PLoS Genetics 05/2006; 2(4):e62. · 8.69 Impact Factor
-
ABSTRACT: We have developed a simple, rapid and scalable RecA-mediated method for identifying novel alternatively spliced full-length cDNA candidates. This method is based on the principle that RecA proteins carry radioisotope-labeled probe DNAs to their homologous sequences, forming triplexes. The resulting complex is easily detected by its mobility difference on electrophoresis. We applied this exon profiling method to four selected mouse genes as a feasibility study. To design probes for detection, information on known exonic regions was extracted from the public database RefSeq. For potentially transcribed novel exonic regions, an RNA mapping experiment using Affymetrix tiling arrays was performed. As a result, we were able to identify alternative splice variants of Thioredoxin domain containing 5, Interleukin 1 beta, Interleukin 1 family 6 and glutamine-rich hypothetical protein. In addition, full-length sequencing demonstrated that our method could profile exon structures with >90% accuracy. This reliable method allows novel splice variants to be screened effectively from a very large set of cDNA clones.
Nucleic Acids Research 01/2006; 34(13):e97. · 8.03 Impact Factor
-
ABSTRACT: The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.
-
Eivind Valen,
Giovanni Pascarella,
Alistair Morgan Chalk, Norihiro Maeda,
Miki Kojima,
Chika Kawazu,
Mitsuyoshi Murata,
Hiromi Nishiyori,
Dejan Lazarevic,
Dario Motti, [......],
Ole Winther,
Takahiro Arakawa,
Jun Kawai,
Christine Wells,
Carsten Daub,
Matthias Harbers,
Yoshihide Hayashizaki,
Stefano Gustincich,
Albin Sandelin,
Piero Carninci
ABSTRACT: Finding and characterizing mRNAs, their transcription start sites (TSS) and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5-10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present a new method for high throughput sequencing of 5' cDNA tags, DeepCAGE: merging the Cap Analysis of Gene Expression method with ultra-high throughput sequence technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters which are preferentially used in hippocampus: this is the most comprehensive promoter dataset for any tissue to date. Using this data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.