Deleterious- and Disease-Allele Prevalence in Healthy Individuals: Insights from Current Predictions, Mutation Databases, and Population-Scale Resequencing.

The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
The American Journal of Human Genetics (Impact Factor: 11.2). 12/2012; 91(6):1022-1032. DOI: 10.1016/j.ajhg.2012.10.015
Source: PubMed

ABSTRACT We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions predicted to be highly damaging, 40-85 of which were homozygous. They also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 of them in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ∼400 damaging variants and ∼2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.

  • ABSTRACT: Recent genomic projects have revealed the existence of an unexpectedly large amount of deleterious variability in the human genome. Several hypotheses have been proposed to explain such an apparently high mutational load. However, the mechanisms by which deleterious mutations in some genes cause a pathological effect but are apparently innocuous in other genes remain largely unknown. This study searched for deleterious variants in the 1000 Genomes populations, as well as in a newly sequenced population of 252 healthy Spanish individuals. In addition, variants causative of monogenic diseases and somatic variants from 41 chronic lymphocytic leukaemia patients were analysed. The deleterious variants found were analysed in the context of the interactome to understand the role of network topology in the maintenance of the observed mutational load. Our results suggest that one of the mechanisms whereby the effect of these deleterious variants on the phenotype is suppressed could be related to the configuration of the protein interaction network. Most of the deleterious variants observed in healthy individuals are concentrated in peripheral regions of the interactome, in combinations that preserve their connectivity, and have a marginal effect on interactome integrity. On the contrary, likely pathogenic cancer somatic deleterious variants tend to occur in internal regions of the interactome, often with associated structural consequences. Finally, variants causative of monogenic diseases seem to occupy an intermediate position. Our observations suggest that the real pathological potential of a variant might be more a systems property than an intrinsic property of individual proteins.
    Molecular Systems Biology 09/2014; 10(9). DOI:10.15252/msb.20145222 · 14.10 Impact Factor
  • ABSTRACT: Many new disease genes can be identified through high-throughput sequencing. Yet variant interpretation for the large amounts of genomic data remains a challenge given variation of uncertain significance and genes that lack disease annotation. As clinically significant disease genes may be subject to negative selection, we developed a prediction method that measures paucity of nonsynonymous variation in the human population to infer gene-based pathogenicity. Integrating human exome data of six thousand individuals from the NHLBI Exome Sequencing Project, we tested the utility of the prediction method based on the ratio of nonsynonymous to synonymous substitution rates (dN/dS) on X-chromosome genes. A low dN/dS ratio characterized genes associated with childhood disease and outcome. Furthermore, we identify new candidates for diseases with early mortality and demonstrate intragenic localized patterns of variants that suggest pathogenic hotspots. Our results suggest intra-human substitution analysis is a valuable tool to help prioritize novel disease genes in sequence interpretation.
    Human Molecular Genetics 09/2014; 24(3). DOI:10.1093/hmg/ddu473 · 6.68 Impact Factor
  • ABSTRACT: The NEU1 gene is the first identified member of the human sialidases, glycohydrolytic enzymes that remove the terminal sialic acid from oligosaccharide chains. Mutations in the NEU1 gene are causative of sialidosis (MIM 256550), a severe lysosomal storage disorder showing an autosomal recessive mode of inheritance. Sialidosis has been classified into two subtypes: sialidosis type I, a normomorphic, late-onset form, and sialidosis type II, a more severe neonatal or early-onset form. A total of 50 causative mutations are reported in the HGMD database, most of which are missense variants. To further characterize the NEU1 gene and identify new functionally relevant protein isoforms, we decided to study its genetic variability in the human population using the data generated by two large sequencing projects: the 1000 Genomes Project (1000G) and the NHLBI GO Exome Sequencing Project (ESP). Together these two datasets comprise a cohort of 7595 sequenced individuals, making it possible to identify rare variants and dissect population-specific ones. By integrating this approach with biochemical and cellular studies, we were able to identify new rare missense and frameshift alleles in the NEU1 gene. Among the 9 candidate variants tested, only two resulted in significantly lower levels of sialidase activity (p<0.05), namely c.650T>C and c.700G>A. These two mutations give rise to the amino acid substitutions p.V217A and p.D234N, respectively. NEU1 variants including either of these two amino acid changes have 44% and 25% residual sialidase activity when compared to the wild-type enzyme, reduced protein levels and altered subcellular localization. Thus they may represent new, putative pathological mutations resulting in sialidosis type I. The in silico approach used in this study has enabled the identification of previously unknown NEU1 functional alleles that are widespread in the population and could be tested in future functional studies.
    PLoS ONE 08/2014; 9(8):e104229. DOI:10.1371/journal.pone.0104229 · 3.53 Impact Factor
