Detection of non-neutral substitution rates on mammalian phylogenies

Gladstone Institutes, University of California, San Francisco, San Francisco, California 94158, USA.
Genome Research (Impact Factor: 14.63). 10/2009; 20(1):110-21. DOI: 10.1101/gr.097857.109
Source: PubMed


Methods for detecting nucleotide substitution rates that are faster or slower than expected under neutral drift are widely used to identify candidate functional elements in genomic sequences. However, most existing methods consider either reductions (conservation) or increases (acceleration) in rate but not both, or assume that selection acts uniformly across the branches of a phylogeny. Here we examine the more general problem of detecting departures from the neutral rate of substitution in either direction, possibly in a clade-specific manner. We consider four statistical, phylogenetic tests for addressing this problem: a likelihood ratio test, a score test, a test based on exact distributions of numbers of substitutions, and the genomic evolutionary rate profiling (GERP) test. All four tests have been implemented in a freely available program called phyloP. Based on extensive simulation experiments, these tests are remarkably similar in statistical power. With 36 mammalian species, they all appear to be capable of fairly good sensitivity with low false-positive rates in detecting strong selection at individual nucleotides, moderate selection in 3-bp elements, and weaker or clade-specific selection in longer elements. By applying phyloP to mammalian multiple alignments from the ENCODE project, we shed light on patterns of conservation/acceleration in known and predicted functional elements, approximate fractions of sites subject to constraint, and differences in clade-specific selection in the primate and glires clades. We also describe new "Conservation" tracks in the UCSC Genome Browser that display both phyloP and phastCons scores for genome-wide alignments of 44 vertebrate species.
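As a rough illustration of the likelihood ratio idea mentioned above, the sketch below scales every branch of a fixed neutral tree by a single factor rho and asks whether a freely estimated rho fits a set of alignment columns better than rho = 1. It is not phyloP's code. It assumes a Jukes-Cantor (JC69) substitution model, a grid search in place of a numerical optimizer, and a chi-square approximation with 1 degree of freedom, which is only approximate when the estimate sits on the edge of the grid. The tree, its branch lengths, and all function names are hypothetical.

```python
import math

BASES = "ACGT"

def jc69(same, t):
    """JC69 probability that a base is unchanged (same=True) or becomes one
    specific other base (same=False) over a branch of length t."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def partial(node, column, rho):
    """Felsenstein pruning: P(data below node | base at node) for A, C, G, T."""
    if isinstance(node, str):              # leaf, identified by species name
        return [1.0 if b == column[node] else 0.0 for b in BASES]
    out = [1.0] * 4
    for child, length in node:             # internal node: [(child, length), ...]
        below = partial(child, column, rho)
        for i in range(4):
            out[i] *= sum(jc69(i == j, rho * length) * below[j] for j in range(4))
    return out

def log_lik(tree, columns, rho):
    """Log-likelihood of alignment columns with all branches scaled by rho."""
    return sum(math.log(sum(0.25 * p for p in partial(tree, col, rho)))
               for col in columns)

def lrt(tree, columns):
    """Compare the neutral model (rho = 1) with a free scale factor rho."""
    grid = [10 ** (-2 + 3 * k / 150) for k in range(151)]   # 0.01 .. 10
    null = log_lik(tree, columns, 1.0)
    best, rho_hat = max((log_lik(tree, columns, r), r) for r in grid)
    stat = max(0.0, 2.0 * (best - null))
    p_value = math.erfc(math.sqrt(stat / 2.0))   # chi-square tail, 1 d.f.
    return {"rho_hat": rho_hat, "stat": stat, "p_value": p_value}

# Hypothetical four-species tree; branch lengths in substitutions per site.
TREE = [("human", 0.1), ("mouse", 0.3), ([("cow", 0.2), ("dog", 0.2)], 0.05)]
conserved = [dict(human="A", mouse="A", cow="A", dog="A")] * 10
diverged = [dict(human="A", mouse="C", cow="G", dog="T")] * 10

for name, cols in (("conserved", conserved), ("diverged", diverged)):
    print(name, lrt(TREE, cols))
```

An estimate of rho below 1 points toward conservation and an estimate above 1 toward acceleration, so a single two-sided statistic covers both directions. A clade-specific variant would scale only the branches inside the clade of interest.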

  • Source
    • "Whenever needed, we manually investigated the raw sequence reads using the Integrative Genomics Viewer (IGV) [13] to exclude false positives calls. The impact of the selected candidate pathogenic missense mutations on the protein function was assessed using three in silico "pathogenicity" predictor tools such as Polyphen-2 [14], MutationTaster [15] and SIFT [16], accounting also for nucleotide evolutionary conservation across species evaluated by the PhyloP algorithm [17]. Furthermore, in order to verify modifications at protein level the protein models have been created (see Supplementary)."
    ABSTRACT: Technological improvements in recent years have considerably advanced knowledge of the etiology of Intellectual Disability (ID). However, at present very little is known about the genetic heterogeneity underlying the Non-Syndromic form of ID (NS-ID). To investigate the genetic basis of NS-ID, we analyzed 43 trios and 22 isolated NS-ID patients using a Targeted Sequencing (TS) approach. Seventy-one NS-ID genes were selected and sequenced in all subjects. We found putative pathogenic mutations in 7 out of 65 patients. The pathogenic role of the mutations was evaluated through sequence comparison, and structural analysis was performed to predict the effect of the alterations in a 3D computational model through molecular dynamics simulations. Additionally, a thorough clinical re-evaluation of the patients was performed after the molecular results. This approach allowed us to find novel pathogenic mutations with a detection rate close to 11% in our cohort of patients. This result supports the hypothesis that many NS-ID related genes still remain to be discovered and that NS-ID is a more complex phenotype than the syndromic form, likely caused by a complex and broad interaction between gene alterations and environmental factors.
    Mutation Research - Fundamental and Molecular Mechanisms of Mutagenesis 09/2015; 781. DOI:10.1016/j.mrfmmm.2015.09.002
  • Source
    • "Furthermore, variants were excluded when being synonymous, having a minor allele frequency above 0.001% in Exome Aggregation Consortium (ExAC) public database or when not predicted pathogenic by any of the included algorithms: SIFT (Kumar et al., 2009), phyloP (Pollard et al., 2010), PolyPhen-2 (Adzhubei et al., 2010), MutationTaster (Schwarz et al., 2010), LRT (Chun and Fay, 2009), and CADD (Kircher et al., 2014). Additionally, exomes were screened for synonymous MAPT variants in exon 9 and 10."
    ABSTRACT: Early-onset Alzheimer's disease (EOAD) accounts for 1%-2% of all Alzheimer's disease (AD) subjects, with large variation in the reported genetic contribution of known dementia genes. In this pilot study, we genetically characterized a German EOAD cohort (23 subjects) by whole-exome sequencing, capturing variants in all recognized AD and frontotemporal dementia genes. After variant filtering, we identified 7 occurrences of 6 different rare variants in 6 subjects, including 4 novel variants. Four of the 6 variants, observed in 5 different index subjects (5/23 = 22%), were considered to be possibly pathogenic. These included 2 presenilin 2 (PSEN2) variants (p.N141I, previously denoted as a Volga German variant and observed in 2 index subjects, and p.L238P), 1 amyloid precursor protein variant (p.I716M), and 1 presenilin 1 variant (ΔE9). Using a control exome data set of 96 ethnically matched neurodegenerative disease controls (Parkinson's disease), we identified only 1 variant (PSEN2 p.T18M) (1%), demonstrating a significantly higher mutational burden in the EOAD group (p < 0.0001). Our findings demonstrate a substantial frequency of variants in dementia genes in EOAD, including several seemingly "sporadic" subjects. This indicates that heritability in EOAD might be higher than assumed. The finding of 3 subjects carrying potential pathogenic PSEN2 variants suggests that, in specific populations, PSEN2 variants might be as frequent as (or more frequent than) presenilin 1 variants, for example, in German populations that are influenced by Volga German heritage. Variants in AD genes were also associated with rare phenotypes such as frontal AD or primary progressive aphasia, demonstrating the need to screen AD genes in frontotemporal dementia-like phenotypes.
    Neurobiology of aging 09/2015; DOI:10.1016/j.neurobiolaging.2015.09.016 · 5.01 Impact Factor
  • Source
    • "Specifically, we used a maximum likelihood test (Pollard et al. 2010) to first identify 113,577 DHSs that exhibit significant evolutionary constraint across primates, which manifest as regions of low sequence divergence compared with carefully defined putatively neutral flanking sequence (FDR = 0.01) (Fig. 1). Next, for DHSs that are conserved in primates, we performed a second likelihood ratio test (Pollard et al. 2010) and identified 524 regulatory sequences that have experienced a significant acceleration of evolution in the human lineage and therefore exhibit an excess of human-specific substitutions (FDR = 0.05) (Fig. 1; Supplemental Table 2). "
    ABSTRACT: It has long been hypothesized that changes in gene regulation have played an important role in human evolution, but regulatory DNA has been much more difficult to study compared with protein-coding regions. Recent large-scale studies have created genome-scale catalogs of DNase I hypersensitive sites (DHSs), which demark potentially functional regulatory DNA. To better define regulatory DNA that has been subject to human-specific adaptive evolution, we performed comprehensive evolutionary and population genetics analyses on over 18 million DHSs discovered in 130 cell types. We identified 524 DHSs that are conserved in nonhuman primates but accelerated in the human lineage (haDHS), and estimate that 70% of substitutions in haDHSs are attributable to positive selection. Through extensive computational and experimental analyses, we demonstrate that haDHSs are often active in brain or neuronal cell types; play an important role in regulating the expression of developmentally important genes, including many transcription factors such as SOX6, POU3F2, and HOX genes; and identify striking examples of adaptive regulatory evolution that may have contributed to human-specific phenotypes. More generally, our results reveal new insights into conserved and adaptive regulatory DNA in humans and refine the set of genomic substrates that distinguish humans from their closest living primate relatives.