Thomas Gingeras

Cold Spring Harbor Laboratory | CSHL · Cancer Centre

Ph.D.

About

315
Publications
79,187
Reads
120,735
Citations

Publications (315)
Article
Sorghum bicolor (L.) Moench is a significant grass crop globally, known for its genetic diversity. High quality genome sequences are needed to capture the diversity. We constructed high-quality, chromosome-level genome assemblies for two vital sorghum inbred lines, Tx2783 and RTx436. Through advanced single-molecule techniques, long-read sequencing...
Preprint
Full-text available
The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin str...
Article
Full-text available
The biological importance of RNA has expanded along with our appreciation of the complexity of its multiple types, structures, chemical compositions and biological roles. Research in RNA has been instrumental in revealing insights into fundamental biological processes including: the organization of information within genomes, the mechanisms of control of g...
Article
Full-text available
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read p...
Preprint
Full-text available
Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has e...
Article
Full-text available
Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has e...
Article
Genes specifying long non-coding RNAs (lncRNAs) occupy a large fraction of the genomes of complex organisms. The term ‘lncRNAs’ encompasses RNA polymerase I (Pol I), Pol II and Pol III transcribed RNAs, and RNAs from processed introns. The various functions of lncRNAs and their many isoforms and interleaved relationships with other genes make lncRN...
Article
Full-text available
Dyed roots reveal inner complexity. Plant roots do so much more than just hold a plant up. As a site for air storage during flooding, mycorrhizal symbiosis, or carbohydrate storage, the more complex root can tap more complicated functions. Taking advantage of a dye that stains less the deeper it penetrates the tissue, Ortiz-Ramírez et al. applied f...
Article
Full-text available
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassig...
Preprint
Full-text available
Most plant roots have multiple cortex layers that make up the bulk of the organ and play key roles in physiology, such as flood tolerance and symbiosis. However, little is known about the formation of cortical layers outside of the highly reduced anatomy of the model Arabidopsis. Here we use single-cell RNAseq to rapidly generate a cell resolution...
Article
Evaluating the impact of genetic variants on transcriptional regulation is a central goal in biological science that has been constrained by reliance on a single reference genome. To address this, we constructed phased, diploid genomes for four cadaveric donors (using long-read sequencing) and systematically charted noncoding regulatory elements an...
Preprint
Full-text available
Evaluating the impact of genetic variants on transcriptional regulation is a central goal in biological science that has been constrained by reliance on a single reference genome. To address this, we constructed phased, diploid genomes for four cadaveric donors (using long-read sequencing) and systematically charted noncoding regulatory elements an...
Article
Full-text available
As a means to understand human neuropsychiatric disorders from human brain samples, we compared the transcription patterns and histological features of postmortem brain to fresh human neocortex isolated immediately following surgical removal. Compared to a number of neuropsychiatric disease-associated postmortem transcriptomes, the fresh human brai...
Preprint
Full-text available
Sorghum bicolor, one of the most important grass crops around the world, harbors a high degree of genetic diversity. We constructed chromosome-level genome assemblies for two important sorghum inbred lines, Tx2783 and RTx436. The final high-quality reference assemblies consist of 19 and 18 scaffolds, respectively, with contig N50 values of 25.6 and...
Article
Crop productivity depends on activity of meristems that produce optimized plant architectures, including that of the maize ear. A comprehensive understanding of development requires insight into the full diversity of cell types and developmental domains and the gene networks required to specify them. Until now, these were identified primarily by mo...
Article
Full-text available
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin struct...
Article
Full-text available
Extracellular RNAs participate in intercellular communication, and are being studied as promising minimally invasive diagnostic markers. Several studies in recent years showed that tRNA halves and distinct Y RNA fragments are abundant in the extracellular space, including in biofluids. While their regulatory and diagnostic potential has gained a su...
Article
Full-text available
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms...
Article
Full-text available
We have produced RNA sequencing data for 53 primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural, and blood cells. These act as basic compon...
Article
Full-text available
MaizeCODE is a project aimed at identifying and analyzing functional elements in the maize genome. In its initial phase, MaizeCODE assayed up to five tissues from four maize strains (B73, NC350, W22, TIL11) by RNA-Seq, ChIP-Seq, RAMPAGE, and small RNA sequencing. To facilitate reproducible science and provide both human and machine access to the Ma...
Preprint
Full-text available
We have produced RNA sequencing data for a number of primary cells from different locations in the human body. The clustering of these primary cells reveals that most cells in the human body share a few broad transcriptional programs, which define five major cell types: epithelial, endothelial, mesenchymal, neural and blood cells. These act as basi...
Article
Full-text available
Alu elements are one of the most successful families of transposons in the human genome. A portion of Alu elements is transcribed by RNA Pol III, whereas the remaining ones are part of Pol II transcripts. Because Alu elements are highly repetitive, it has been difficult to identify the Pol III-transcribed elements and quantify their expression leve...
Article
Long noncoding RNAs (lncRNAs) can regulate target gene expression by acting in cis (locally) or in trans (non-locally). Here, we performed genome-wide expression analysis of Toll-like receptor (TLR)-stimulated human macrophages to identify pairs of cis-acting lncRNAs and protein-coding genes involved in innate immunity. A total of 229 gene pairs we...
Article
Full-text available
Long noncoding RNAs (lncRNAs) can regulate target gene expression by acting in cis (locally) or in trans (non-locally). Here, we performed genome-wide expression analysis of Toll-like receptor (TLR)-stimulated human macrophages to identify pairs of cis-acting lncRNAs and protein-coding genes involved in innate immunity. A total of 229 gene pairs we...
Preprint
Full-text available
MicroRNAs (miRNAs) play a critical role as post-transcriptional regulators of gene expression. The ENCODE project profiled the expression of miRNAs in a comprehensive set of tissues during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct tissue and developmental stage specific miRNA...
Article
Full-text available
MicroRNAs (miRNAs) play a critical role as post-transcriptional regulators of gene expression. The ENCODE project profiled the expression of miRNAs in a comprehensive set of tissues during a time-course of mouse embryonic development and captured the expression dynamics of 785 miRNAs. We found distinct tissue and developmental stage specific miRNA...
Article
Full-text available
Many tools are available for RNA-seq alignment and expression quantification, with comparative value being hard to establish. Benchmarking assessments often highlight methods' good performance, but are focused on either model data or fail to explain variation in performance. This leaves us to ask, what is the most meaningful way to assess different...
Preprint
Full-text available
Many tools are available for RNA-seq alignment and expression quantification, with comparative value being hard to establish. Benchmarking assessments often highlight methods’ good performance, but are focused on either model data or fail to explain variation in performance. This leaves us to ask, what is the most meaningful way to assess different...
Article
Full-text available
Extracellular RNA (exRNA) has emerged as an important transducer of intercellular communication. Advancing exRNA research promises to revolutionize biology and transform clinical practice. Recent efforts have led to cutting-edge research and expanded knowledge of this new paradigm in cell-to-cell crosstalk; however, gaps in our understanding of EV...
Article
Full-text available
Multicellular development is driven by regulatory programs that orchestrate the transcription of protein-coding and noncoding genes. To decipher this genomic regulatory code, and to investigate the developmental relevance of noncoding transcription, we compared genome-wide promoter activity throughout embryogenesis in 5 Drosophila species. Core pro...
Data
This table summarizes the most enriched Gene Ontology (GO) functional annotation categories for each gene expression cluster described in Figure 6—figure supplement 3.
Article
Full-text available
Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete—many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRN...
Preprint
Full-text available
Accurate annotation of genes and their transcripts is a foundation of genomics, but no annotation technique presently combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, while thousands more remain uncatalogued, particularly for long noncoding RNAs (lncRNAs). To accelerate l...
Article
Full-text available
In the FANTOM5 project, transcription initiation events across the human and mouse genomes were mapped at single base-pair resolution and their frequencies were monitored by CAGE (Cap Analysis of Gene Expression) coupled with single-molecule sequencing. Approximately three thousand samples, consisting of a variety of primary cells, tissues, c...
Preprint
Full-text available
Multicellular development is largely determined by transcriptional regulatory programs that orchestrate the expression of thousands of protein-coding and noncoding genes. To decipher the genomic regulatory code that specifies these programs, and to investigate globally the developmental relevance of noncoding transcription, we profiled genome-wide...
Article
Cross-species comparisons of genomes, transcriptomes and gene regulation are now feasible at unprecedented resolution and throughput, enabling the comparison of human and mouse biology at the molecular level. Insights have been gained into the degree of conservation between human and mouse at the level of not only gene expression but also epigeneti...
Preprint
Full-text available
Motivation: Fusion genes created by genomic rearrangements can be potent drivers of tumorigenesis. However, accurate identification of functional fusion genes from genomic sequencing requires whole genome sequencing, since exonic sequencing alone is often insufficient. Transcriptome sequencing provides a direct, highly effective alternative for ca...
Chapter
Recent advances in high-throughput sequencing technology have made it possible to probe cell transcriptomes by generating hundreds of millions of short reads, which represent fragments of the transcribed RNA molecules. The first and most crucial task in RNA-seq data analysis is mapping the reads to the reference genome. STAR (Spliced T...
Article
Full-text available
Background: A comparison of transcriptional profiles derived from different tissues in a given species or among different species assumes that commonalities reflect evolutionarily conserved programs and that differences reflect species or tissue responses to environmental conditions or developmental program staging. Apparently conflicting results...
Article
Full-text available
Extracellular vesicles (EVs) have been proposed as a means to promote intercellular communication. We show that when human primary cells are exposed to cancer cell EVs, rapid cell death of the primary cells is observed, while cancer cells treated with primary or cancer cell EVs do not display this response. The active agents that trigger cell death...
Article
Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrang...
Article
Full-text available
Mice have been a long-standing model for human biology and disease. Here we characterize, by RNA sequencing, the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles in human cell lines reveals subst...
Article
Full-text available
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-ass...
Article
Full-text available
The laboratory mouse shares the majority of its protein-coding genes with humans, making it the premier model organism in biomedical research, yet the two mammals differ in significant ways. To gain greater insights into both shared and species-specific transcriptional and cellular regulatory programs in the mouse, the Mouse ENCODE Consortium has m...
Article
Full-text available
Significance: To date, various studies have found similarities between humans and mice on a molecular level, and indeed, the murine model serves as an important experimental system for biomedical science. In this study of a broad number of tissues between humans and mice, high-throughput sequencing assays on the transcriptome and epigenome reveal th...
Preprint
Full-text available
We characterized by RNA-seq the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles obtained in human cell lines reveals substantial conservation of transcriptional programs, and uncovers a distinct...
Article
Full-text available
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow compariso...
Article
Full-text available
Letter. Reply to Brunet and Doolittle: Both selected effect and causal role elements can influence human biology and disease. We agree with Brunet and Doolittle (1) on the utility of distinguishing the evolutionarily selected effects (SE) of some genomic elements from the causal roles (CR) of other elements that lack signatures of selection (1–4). DN...
Article
Full-text available
Most RNA molecules are co- or post-transcriptionally modified to alter their chemical and functional properties to assist in their ultimate biological function. Among these modifications, the addition of a 5' cap structure has been found to regulate turnover and localization. Here we report a study of the cap structure of human short (<200 nt) RNAs (...
Article
Full-text available
MiRNAs bear an increasing number of functions throughout development and in the aging adult. Here we address their role in establishing sexually dimorphic traits and sexual identity in male and female Drosophila. Our survey of miRNA populations in each sex identifies sets of miRNAs differentially expressed in male and female tissues across various...
Article
Full-text available
With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in ma...
Article
Naturally occurring regulatory T (Treg) cells, which specifically express the transcription factor forkhead box P3 (Foxp3), are engaged in the maintenance of immunological self-tolerance and homeostasis. By transcriptional start site cluster analysis, we assessed here how genome-wide patterns of DNA methylation or Foxp3 binding sites were associate...
Article
Full-text available
Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types...