Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Nature (Impact Factor: 38.6). 07/2007; 447(7146):799-816. DOI: 10.1038/nature05874
Source: PubMed

ABSTRACT We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

  • Source
    ABSTRACT: Identifying functional non-coding variants is one of the greatest unmet challenges in genetics. To help address this, we introduce an R package, SuRFR, which integrates functional annotation and prior biological knowledge to prioritise candidate functional variants. SuRFR is publicly available, modular, flexible, fast, and simple to use. We demonstrate that SuRFR performs with high sensitivity and specificity and provide a widely applicable and scalable benchmarking dataset for model training and validation. Website:
    Genome Medicine 01/2014; 6(10):79. · 4.94 Impact Factor
  • Source
    ABSTRACT: The majority of the human genome is transcribed, even though only 2% of transcripts encode proteins. Non-coding transcripts were originally dismissed as evolutionary junk or transcriptional noise, but with the development of whole genome technologies, these non-coding RNAs (ncRNAs) are emerging as molecules with vital roles in regulating gene expression. While shorter ncRNAs have been extensively studied, the functional roles of long ncRNAs (lncRNAs) are still being elucidated. Studies over the last decade show that lncRNAs are emerging as new players in a number of diseases including cancer. Potential roles in both oncogenic and tumor suppressive pathways in cancer have been elucidated, but the biological functions of the majority of lncRNAs remain to be identified. Accumulated data are identifying the molecular mechanisms by which lncRNA mediates both structural and functional roles. LncRNA can regulate gene expression at both transcriptional and post-transcriptional levels, including splicing and regulating mRNA processing, transport, and translation. Much current research is aimed at elucidating the function of lncRNAs in breast cancer and mammary gland development, and at identifying the cellular processes influenced by lncRNAs. In this paper we review current knowledge of lncRNAs contributing to these processes and present lncRNA as a new paradigm in breast cancer development.
    Frontiers in Genetics 01/2014; 5:379.
  • Source
    ABSTRACT: Although growth rate is one of the main economic traits of concern in pig production, there is limited knowledge of its epigenetic regulation, such as DNA methylation. In this study, we conducted methyl-CpG binding domain protein-enriched genome sequencing (MBD-seq) to compare the genome-wide DNA methylation profiles of small intestine and liver tissue between fast- and slow-growing weaning piglets. The genome-wide methylation patterns of the two growth groups showed similar proportions of CpG (regions of DNA where a cytosine nucleotide occurs next to a guanine nucleotide in the linear sequence) coverage, genomic regions, and gene regions. Differentially methylated regions and genes were also identified for downstream analysis. In canonical pathway analysis using differentially methylated genes, pathways expected to be related to growth rate (the triacylglycerol pathway, some cell cycle related pathways, and the insulin receptor signaling pathway) were enriched in the two organ tissues. Differentially methylated genes were also organized in gene networks related to cellular development, growth, and carbohydrate metabolism. Even though further study is required, the results of this study may contribute to the understanding of epigenetic regulation in pig growth.
    Asian Australasian Journal of Animal Sciences 11/2014; 27(11):1532-9. · 0.64 Impact Factor