Gene inactivation and its implications for annotation in the era of personal genomics.

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA.
Genes & Development (Impact Factor: 12.64). 01/2011; 25(1):1-10. DOI: 10.1101/gad.1968411
Source: PubMed

ABSTRACT The first wave of personal genomes documents how no single individual genome contains the full complement of functional genes. Here, we describe the extent of variation in gene and pseudogene numbers between individuals arising from inactivation events such as premature termination or aberrant splicing due to single-nucleotide polymorphisms. This highlights the inadequacy of the current reference sequence and gene set. We present a proposal to define a reference gene set that will remain stable as more individuals are sequenced. In particular, we recommend that the ancestral allele be used to define the reference sequence from which a core human reference gene annotation set can be derived. In addition, we call for the development of an expanded gene set to include human-specific genes that have arisen recently and are absent from the ancestral set.

  • ABSTRACT: Background & Aims Little is known about the genetic factors that contribute to the development of sessile serrated adenomas (SSAs). SSAs contain somatic mutations in BRAF or KRAS early in development. However, evidence from humans and mouse models indicates that these mutations result in oncogene-induced senescence (OIS) of intestinal crypt cells. Progression to serrated neoplasia requires cells to escape OIS via inactivation of tumor suppressor pathways. We investigated whether subjects with multiple SSAs carry germline loss-of-function mutations (nonsense and splice site) in genes that regulate OIS: the p16-Rb and ATM-ATR DNA damage response pathways. Methods Through a bioinformatic analysis of the literature, we identified a set of genes that function at the main nodes of the p16-Rb and ATM-ATR DNA damage response pathways. We performed whole-exome sequencing of 20 unrelated subjects with multiple SSAs; most had features of serrated polyposis. We compared sequences with those from 4300 subjects matched for ethnicity (controls). We also used an integrative genomics approach to identify additional genes involved in senescence mechanisms. Results We identified mutations in genes that regulate senescence (ATM, PIF1, TELO2, XAF1, and RBL1) in 5 of 20 subjects with multiple SSAs (odds ratio, 3.0; 95% confidence interval, 0.9–8.9; P = .04). In 2 subjects, we found nonsense mutations in RNF43, indicating that it is also associated with multiple serrated polyps (odds ratio, 460; 95% confidence interval, 23.1–16,384; P = 6.8 × 10⁻⁵). In knockdown experiments with pancreatic duct cells exposed to UV light, RNF43 appeared to function as a regulator of ATM-ATR DNA damage response. Conclusions We associated germline loss-of-function variants in genes that regulate senescence pathways with the development of multiple SSAs. We identified RNF43 as a regulator of the DNA damage response and associated nonsense variants in this gene with a high risk of developing SSAs.
    Gastroenterology 01/2013; 146(2). DOI:10.1053/j.gastro.2013.10.045 · 13.93 Impact Factor
  • ABSTRACT: When messenger RNA splicing occurs co-transcriptionally, the potential for kinetic control based on transcription dynamics is widely recognized. Indeed, perturbation studies have reported that when transcription kinetics are perturbed genetically or pharmacologically splice patterns may change. However, whether kinetic control is contributing to the control of splicing within the normal range of physiological conditions remains unknown. We examined if the kinetic determinants for co-transcriptional splicing (CTS) might be reflected in the structure and expression patterns of the genome and epigenome. To identify and then quantitatively relate multiple, simultaneous CTS determinants, we constructed a scalable mathematical model of the kinetic interplay of RNA synthesis and CTS and parameterized it with diverse next generation sequencing (NGS) data. We thus found a variety of CTS determinants encoded in vertebrate genomes and epigenomes, and that these combine variously for different groups of genes such as housekeeping versus regulated genes. Together, our findings indicate that the kinetic basis of splicing is functionally and physiologically relevant, and may meaningfully inform the analysis of genomic and epigenomic data to provide insights that are missed when relying on statistical approaches alone. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
    Nucleic Acids Research 12/2014; 43(2). DOI:10.1093/nar/gku1338 · 8.81 Impact Factor
  • ABSTRACT: Although numerous approaches have been pursued to understand the function of human genes, Mendelian genetics has by far provided the most compelling and medically actionable dataset. Biallelic loss-of-function (LOF) mutations are observed in the majority of autosomal recessive Mendelian disorders, representing natural human knockouts and offering a unique opportunity to study the physiological and developmental context of these genes. The restriction of such context to 'disease' states is artificial, however, and the recent ability to survey entire human genomes for biallelic LOF mutations has revealed a surprising landscape of knockout events in 'healthy' individuals, sparking interest in their role in phenotypic diversity beyond disease causation. As I discuss in this review, the potentially wide implications of human knockout research warrant increased investment and multidisciplinary collaborations to overcome existing challenges and reap its benefits. Copyright © 2014 Elsevier Ltd. All rights reserved.
    Trends in Genetics 12/2014; DOI:10.1016/j.tig.2014.11.003 · 11.60 Impact Factor

