Deleterious- and Disease-Allele Prevalence in Healthy Individuals: Insights from Current Predictions, Mutation Databases, and Population-Scale Resequencing

The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
The American Journal of Human Genetics (Impact Factor: 10.93). 12/2012; 91(6):1022-1032. DOI: 10.1016/j.ajhg.2012.10.015
Source: PubMed


We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions predicted to be highly damaging, 40-85 of which were homozygous. Each also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 of them in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ∼400 damaging variants and ∼2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.



Available from: David N Cooper
    • "Whereas access may be a barrier to acquire GEP, WG/ES creates the opposite challenge of potentially overwhelming providers and patients with information coupled with insufficient resources to managing the data. Most people have approximately 400 potentially damaging variants and, on average, two disease causing mutations (Xue et al. 2012). With over 1 million Canadians living with cancer and another 200 000 diagnosed this year (Canadian Cancer Society 2014), hundreds of thousands of individuals may incidentally discover they are harbouring disease causing mutations in the course of their tailored treatment using WG/ES."

    Full-text · Article · Nov 2015 · Genome
    • "A study on 500 exomes from European and 500 exomes from African-American adults from the National Heart, Lung and Blood Institute (NHLBI) Exome Sequencing project estimated a frequency of 3.4% and 1.2%, respectively, in individuals of European and African descent, of high-penetrance actionable pathogenic or likely pathogenic variants (Dorschner et al. 2013). Furthermore, studies in asymptomatic individuals and analysis of the "1000 Genomes" project data have indicated that healthy individuals carry many variants that have been classified as pathogenic in mutation databases, raising the issue that such classification is not always accurate (MacArthur et al. 2012; Xue et al. 2012; Cassa et al. 2013). One study has used data from the NHLBI project to question pathogenicity of previously reported X-linked intellectual disability genes (Piton et al. 2013)."
    ABSTRACT: New sequencing methods capable of rapidly analyzing the genome at increasing resolution have transformed diagnosis of single-gene or oligogenic genetic disorders in pediatric and adult medicine. Targeted tests, consisting of disease-focused multigene panels and diagnostic exome sequencing to interrogate the sequence of the coding regions of nearly all genes, are now clinically offered when there is suspicion for an undiagnosed genetic disorder or cancer in children and adults. Implementation of diagnostic exome and genome sequencing tests on invasively and noninvasively obtained fetal DNA samples for prenatal genetic diagnosis is also being explored. We predict that they will become more widely integrated into prenatal care in the near future. Providers must prepare for the practical, ethical, and societal dilemmas that accompany the capacity to generate and analyze large amounts of genetic information about the fetus during pregnancy.
    Preview · Article · Aug 2015 · Cold Spring Harbor Perspectives in Medicine
    • "Finally, the introduction of next-generation sequencing in clinical laboratories is causing an explosion in the number of DNA variants identified in and around genes [Yang et al., 2013]. Unfortunately, interpreting the clinical implications of variants in or near splice sites is challenging as functional annotation of DNA variants in publically available databases is inadequate [Xue et al., 2012]."
    ABSTRACT: Assessment of the functional consequences of variants near splice sites is a major challenge in the diagnostic laboratory. To address this issue, we created Expression Minigenes (EMGs) to determine the RNA and protein products generated by splice site variants (n = 10) implicated in cystic fibrosis (CF). Experimental results were compared with the splicing predictions of eight in silico tools. EMGs containing the full-length Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) coding sequence and flanking intron sequences generated wild-type transcript and fully processed protein in Human Embryonic Kidney (HEK293) and CF bronchial epithelial (CFBE41o-) cells. Quantification of variant-induced aberrant mRNA isoforms was concordant using fragment analysis and pyrosequencing. The splicing patterns of c.1585-1G>A and c.2657+5G>A were comparable to those reported in primary cells from individuals bearing these variants. Bioinformatics predictions were consistent with experimental results for 9/10 variants (MES), 8/10 variants (NNSplice) and 7/10 variants (SSAT and Sroogle). Programs that estimate the consequences of mis-splicing predicted 11/16 (HSF and ASSEDA) and 10/16 (Fsplice and SplicePort) experimentally observed mRNA isoforms. EMGs provide a robust experimental approach for clinical interpretation of splice site variants and refinement of in silico tools.
    Full-text · Article · Oct 2014 · Human Mutation