Deleterious- and Disease-Allele Prevalence in Healthy Individuals: Insights from Current Predictions, Mutation Databases, and Population-Scale Resequencing.

The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
The American Journal of Human Genetics (Impact Factor: 10.99). 12/2012; 91(6):1022-1032. DOI: 10.1016/j.ajhg.2012.10.015
Source: PubMed

ABSTRACT We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions, 40-85 of which were homozygous, predicted to be highly damaging. They also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 variants in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ∼400 damaging variants and ∼2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.

  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Recent genomic projects have revealed the existence of an unexpectedly large amount of deleterious variability in the human genome. Several hypotheses have been proposed to explain such an apparently high mutational load. However, the mechanisms by which deleterious mutations in some genes cause a pathological effect but are apparently innocuous in other genes remain largely unknown. This study searched for deleterious variants in the 1,000 genomes populations, as well as in a newly sequenced population of 252 healthy Spanish individuals. In addition, variants causative of monogenic diseases and somatic variants from 41 chronic lymphocytic leukaemia patients were analysed. The deleterious variants found were analysed in the context of the interactome to understand the role of network topology in the maintenance of the observed mutational load. Our results suggest that one of the mechanisms whereby the effect of these deleterious variants on the phenotype is suppressed could be related to the configuration of the protein interaction network. Most of the deleterious variants observed in healthy individuals are concentrated in peripheral regions of the interactome, in combinations that preserve their connectivity, and have a marginal effect on interactome integrity. On the contrary, likely pathogenic cancer somatic deleterious variants tend to occur in internal regions of the interactome, often with associated structural consequences. Finally, variants causative of monogenic diseases seem to occupy an intermediate position. Our observations suggest that the real pathological potential of a variant might be more a systems property rather than an intrinsic property of individual proteins.
    Molecular Systems Biology 09/2014; 10(9). DOI:10.15252/msb.20145222 · 14.10 Impact Factor
  • [Show abstract] [Hide abstract]
    ABSTRACT: Many new disease genes can be identified through high-throughput sequencing. Yet variant interpretation for the large amounts of genomic data remains a challenge given variation of uncertain significance and genes that lack disease annotation. As clinically-significant disease genes may be subject to negative selection, we developed a prediction method that measures paucity of nonsynonymous variation in the human population to infer gene-based pathogenicity. Integrating human exome data of six thousand individuals from the NHLBI Exome Sequencing Project, we tested the utility of the prediction method based on the ratio of non-synonymous to synonymous substitution rates (dN/dS) on X-chromosome genes. A low dN/dS ratio characterized genes associated with childhood disease and outcome. Furthermore, we identify new candidates for diseases with early mortality and demonstrate intragenic localized patterns of variants that suggest pathogenic hotspots. Our results suggest intra-human substitution analysis is a valuable tool to help prioritize novel disease genes in sequence interpretation.
    Human Molecular Genetics 09/2014; 24(3). DOI:10.1093/hmg/ddu473 · 6.68 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: The use of relatively low numbers of sires in cattle breeding programs, particularly on those for carcass and weight traits in Nellore beef cattle (Bos indicus) in Brazil, has always raised concerns about inbreeding, which affects conservation of genetic resources and sustainability of this breed. Here, we investigated the distribution of autozygosity levels based on runs of homozygosity (ROH) in a sample of 1,278 Nellore cows, genotyped for over 777,000 SNPs. We found ROH segments larger than 10 Mb in over 70% of the samples, representing signatures most likely related to the recent massive use of few sires. However, the average genome coverage by ROH (>1 Mb) was lower than previously reported for other cattle breeds (4.58%). In spite of 99.98% of the SNPs being included within a ROH in at least one individual, only 19.37% of the markers were encompassed by common ROH, suggesting that the ongoing selection for weight, carcass and reproductive traits in this population is too recent to have produced selection signatures in the form of ROH. Three short-range highly prevalent ROH autosomal hotspots (occurring in over 50% of the samples) were observed, indicating candidate regions most likely under selection since before the foundation of Brazilian Nellore cattle. The putative signatures of selection on chromosomes 4, 7, and 12 may be involved in resistance to infectious diseases and fertility, and should be subject of future investigation.
    Frontiers in Genetics 01/2015; 6. DOI:10.3389/fgene.2015.00005

Full-text (2 Sources)

Available from
May 23, 2014