Analytical methods for inferring functional effects of single base pair substitutions in human cancers

Department of Bioinformatics, Genentech, Inc., 1 DNA Way, M.S. 93, South San Francisco, CA 94080, USA.
Human Genetics (Impact Factor: 4.82). 06/2009; 126(4):481-98. DOI: 10.1007/s00439-009-0677-y


Cancer is a genetic disease that results from a variety of genomic alterations. Identification of some of these causal genetic events has enabled the development of targeted therapeutics and spurred efforts to discover the key genes that drive cancer formation. Rapidly improving sequencing and genotyping technology continues to generate increasingly large datasets that require analytical methods to identify functional alterations that deserve additional investigation. This review examines statistical and computational approaches for the identification of functional changes among sets of single-nucleotide substitutions. Frequency-based methods identify the most highly mutated genes in large-scale cancer sequencing efforts while bioinformatics approaches are effective for independent evaluation of both non-synonymous mutations and polymorphisms. We also review current knowledge and tools that can be utilized for analysis of alterations in non-protein-coding genomic sequence.
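
As a rough illustration of the frequency-based idea mentioned above, the sketch below ranks genes by how unlikely their observed mutation counts are under a uniform background mutation rate. It is a minimal example under stated assumptions, not the method of any particular study covered by the review: the binomial background model, the function names `binomial_tail` and `rank_genes`, the gene labels, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), assuming 0 < p < 1."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        # Log of the binomial probability mass at i, to avoid overflow.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        term = math.exp(log_term)
        total += term
        # Once past the mean, the remaining terms only shrink; stop when negligible.
        if i > n * p and term <= total * 1e-17:
            break
    return min(total, 1.0)


def rank_genes(genes, n_samples, rate_per_base):
    """Rank genes by the tail probability of their observed mutation count.

    genes maps a gene name to (coding length in bases, observed mutations).
    Assumption: every base in every sample mutates independently at the
    same background rate, which is a strong simplification.
    """
    results = []
    for name, (length_bp, observed) in genes.items():
        trials = length_bp * n_samples
        results.append((name, binomial_tail(observed, trials, rate_per_base)))
    return sorted(results, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical toy input, not real data.
    toy_genes = {"GENE_A": (1500, 12), "GENE_B": (3000, 1), "GENE_C": (900, 0)}
    for gene, p_value in rank_genes(toy_genes, n_samples=100, rate_per_base=1e-6):
        print(gene, p_value)
```

Genes at the top of the ranking carry more mutations than the background model predicts. The sketch ignores sequence context, gene-specific mutation rates, and multiple-testing correction, all of which a real analysis would need to address.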

    • "These methods are now commonly used to assess the potential functional consequences of discovered somatic variants in genome-wide scans. Depending on the specifics of training data and the availability of functional information, they can perform strongly (Kaminker et al. 2007; Torkamani and Schork 2008; Carter et al. 2009; Lee et al. 2009a). Many of the bioinformatic approaches to predict the functional effects of missense mutations were initially developed for germline variation and have been applied to better understand common and rare disease variants as well as evolution (Ng and Henikoff 2001; Ramensky et al. 2002; Bromberg and Rost 2007; Kryukov et al. 2007). "
    ABSTRACT: A key goal in cancer research is to find the genomic alterations that underlie malignant cells. Genomics has proved successful in identifying somatic variants at a large scale. However, it has become evident that a typical cancer exhibits a heterogeneous mutation pattern across samples. Cases where the same alteration is observed repeatedly seem to be the exception rather than the norm. Thus, pinpointing the key alterations (driver mutations) from a background of variations with no direct causal link to cancer (passenger mutations) is difficult. Here we analyze somatic missense mutations from cancer samples and their healthy tissue counterparts (germline mutations) from the viewpoint of germline fitness. We calibrate a scoring system from protein domain alignments to score mutations and their target loci. We show first that this score predicts to a good degree the rate of polymorphism of the observed germline variation. The scoring is then applied to somatic mutations. We show that candidate cancer genes prone to copy number loss harbor mutations with germline fitness effects that are significantly more deleterious than expected by chance. This suggests that missense mutations play a driving role in tumor suppressor genes. Furthermore, these mutations fall preferentially onto loci in sequence neighborhoods that are high scoring in terms of germline fitness. In contrast, for somatic mutations in candidate oncogenes we do not observe a statistically significant effect. These results help to inform how to exploit germline fitness predictions in discovering new genes and mutations responsible for cancer.
    Genetics 03/2011; 188(2):383-93. DOI:10.1534/genetics.111.127480 · 5.96 Impact Factor
    ABSTRACT: Understanding and predicting the molecular cause of disease is one of the major challenges for biology and medicine. One particular area of interest continues to be computational analyses of disease-associated amino acid substitutions. To this end, various studies have been performed to identify molecular functions disrupted by disease-causing mutations. Here, we investigate the influence of disease-associated mutations on post-translational modifications. In particular, we study the loss of modification target sites as a consequence of disease mutation. We find that about 5% of disease-associated mutations may affect known modification sites, either partially (4%) or fully (1%), compared to about 2% of putatively neutral polymorphisms. Most of the fifteen post-translational modification types analyzed were found to be disrupted at levels higher than expected by chance. Molecular functions and physicochemical properties at sites of disease mutation were also compared to those of neutral polymorphisms involved in the process of post-translational modification site disruption. Disease-associated mutations in the neighborhood of post-translationally modified sites were found to be enriched in mutations that change polarity, charge, and hydrophobicity of the wild-type amino acids. Overall, these results further suggest that disruption of modification sites is an important but not the major cause of human genetic disease.
    Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing 01/2010; 15:337-47. DOI:10.1142/9789814295291_0036
    ABSTRACT: Genomic instability is a characteristic of most cancers. In hereditary cancers, genomic instability results from mutations in DNA repair genes and drives cancer development, as predicted by the mutator hypothesis. In sporadic (non-hereditary) cancers the molecular basis of genomic instability remains unclear, but recent high-throughput sequencing studies suggest that mutations in DNA repair genes are infrequent before therapy, arguing against the mutator hypothesis for these cancers. Instead, the mutation patterns of the tumour suppressor TP53 (which encodes p53), ataxia telangiectasia mutated (ATM) and cyclin-dependent kinase inhibitor 2A (CDKN2A; which encodes p16INK4A and p14ARF) support the oncogene-induced DNA replication stress model, which attributes genomic instability and TP53 and ATM mutations to oncogene-induced DNA damage.
    Nature Reviews Molecular Cell Biology 03/2010; 11(3):220-8. DOI:10.1038/nrm2858 · 37.81 Impact Factor