dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions.

Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
Human Mutation (Impact Factor: 5.05). 04/2011; 32(8):894-9. DOI: 10.1002/humu.21517
Source: PubMed

ABSTRACT With the advance of sequencing technologies, whole exome sequencing has increasingly been used to identify mutations that cause human diseases, especially rare Mendelian diseases. Among the analysis steps, functional prediction (of being deleterious) plays an important role in filtering or prioritizing nonsynonymous SNPs (NSs) for further analysis. Unfortunately, different prediction algorithms use different information, and each has its own strengths and weaknesses. It has been suggested that investigators should use predictions from multiple algorithms instead of relying on a single one. However, querying predictions from different databases/Web-servers for different algorithms is both tedious and time-consuming, especially when dealing with the huge number of NSs identified by exome sequencing. To facilitate the process, we developed dbNSFP (database for nonsynonymous SNPs' functional predictions). It compiles prediction scores from four new and popular algorithms (SIFT, Polyphen2, LRT, and MutationTaster), along with a conservation score (PhyloP) and other related information, for every potential NS in the human genome (a total of 75,931,005). It is the first integrated database of functional predictions from multiple algorithms for the comprehensive collection of human NSs. dbNSFP is freely available for download at
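
The idea of combining predictions from several algorithms, rather than relying on one, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the column names (`sift_score`, `polyphen2_score`), the cutoffs, the "." missing-value marker, and the three example variants are hypothetical and are not taken from the dbNSFP distribution or from the paper.

```python
import csv
import io

# Hypothetical column names and cutoffs, chosen only for illustration.
# Each rule returns True when that predictor would call the variant deleterious.
RULES = {
    "sift_score": lambda s: s <= 0.05,       # lower score = more likely damaging
    "polyphen2_score": lambda s: s >= 0.85,  # higher score = more likely damaging
}


def count_deleterious_votes(row):
    """Return (votes, available) for one variant.

    votes: number of predictors calling the variant deleterious.
    available: number of predictors that reported a score at all.
    """
    votes = 0
    available = 0
    for column, is_deleterious in RULES.items():
        value = row.get(column, "")
        if value in ("", "."):  # assumed missing-value markers
            continue
        available += 1
        if is_deleterious(float(value)):
            votes += 1
    return votes, available


def prioritize(rows, min_votes=2):
    """Keep variants that at least `min_votes` predictors call deleterious."""
    return [row for row in rows if count_deleterious_votes(row)[0] >= min_votes]


# Made-up tab-delimited input with three variants.
EXAMPLE = (
    "chr\tpos\tref\talt\tsift_score\tpolyphen2_score\n"
    "1\t100\tA\tG\t0.01\t0.99\n"
    "1\t200\tC\tT\t0.40\t0.95\n"
    "2\t300\tG\tA\t.\t0.10\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE), delimiter="\t"))
kept = prioritize(rows)
print([(r["chr"], r["pos"]) for r in kept])
```

Requiring agreement between predictors is only one possible policy. A stricter or looser `min_votes`, or a rule that accounts for missing scores through the `available` count, may suit a given study better.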

  • ABSTRACT: Many scientists complain that the current funding situation is dire. Indeed, there has been an overall decline in research funding from the National Institutes of Health and the National Science Foundation. Within the Drosophila field, some of us question how long this funding crunch will last, as it demotivates principal investigators and, perhaps more importantly, affects the long-term career choices of many young scientists. Yet numerous very interesting biological processes and avenues remain to be investigated in Drosophila, and probing questions can be answered quickly and efficiently in flies to reveal new biological phenomena. Moreover, Drosophila is an excellent model organism for studies that have translational impact for genetic disease and for other medical implications such as vector-borne illnesses. We would like to promote better collaboration between Drosophila geneticists/biologists and human geneticists/bioinformaticians/clinicians, as it would benefit both fields and significantly impact research on human diseases. Copyright © 2015, The Genetics Society of America.
    Genetics 01/2015; · 4.87 Impact Factor
  • ABSTRACT: Pheochromocytomas and paragangliomas (PCC/PGL) are the solid tumour type most commonly associated with an inherited susceptibility syndrome. However, very little is known about the somatic genetic changes leading to tumorigenesis or malignant transformation. Here we perform whole-exome sequencing on a discovery set of 21 PCC/PGL and identify somatic ATRX mutations in two SDHB-associated tumours. Targeted sequencing of a separate validation set of 103 PCC/PGL identifies somatic ATRX mutations in 12.6% of PCC/PGL. PCC/PGL with somatic ATRX mutations are associated with alternative lengthening of telomeres and clinically aggressive behaviour. This finding suggests that loss of ATRX, an SWI/SNF chromatin remodelling protein, is important in the development of clinically aggressive PCC/PGL.
    Nature Communications 01/2015; 6:6140. · 10.74 Impact Factor
  • ABSTRACT: Interpreting the impact of human genome variation on phenotype is challenging. The functional effect of protein-coding variants is often predicted using sequence conservation and population frequency data; however, other factors are likely relevant. We hypothesized that variants in protein post-translational modification (PTM) sites contribute to phenotype variation and disease. We analyzed the fraction of rare variants and the non-synonymous to synonymous variant ratio (Ka/Ks) in 7,500 human genomes and found a significant negative selection signal in PTM regions, independent of six factors including conservation, codon usage, and GC-content, that is widely distributed across tissue-specific genes and function classes. PTM regions are also enriched in known disease mutations, suggesting that PTM variation is more likely deleterious. PTM constraint also affects flanking sequence around modified residues and increases around clustered sites, indicating the presence of functionally important short linear motifs. Using target site motifs of 124 kinases, we predict at least ∼180,000 motif-breaker amino acid residues that disrupt PTM sites when substituted, and highlight kinase motifs that show specific negative selection and enrichment of disease mutations. We provide this dataset with corresponding hypothesized mechanisms as a community resource. As an example of our integrative approach, we propose that PTPN11 variants in Noonan syndrome aberrantly activate the protein by disrupting an uncharacterized cluster of phosphorylation sites. Further, as PTMs are molecular switches that are modulated by drugs, we study mutated binding sites of PTM enzymes in disease genes and define a drug-disease network containing 413 novel predicted disease-gene links.
    PLoS Genetics 01/2015; 11(1):e1004919. · 8.17 Impact Factor
