Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution.
ABSTRACT Genomes contain both a genetic code specifying amino acids and a regulatory code specifying transcription factor (TF) recognition sequences. We used genomic deoxyribonuclease I footprinting to map nucleotide resolution TF occupancy across the human exome in 81 diverse cell types. We found that ~15% of human codons are dual-use codons ("duons") that simultaneously specify both amino acids and TF recognition sites. Duons are highly conserved and have shaped protein evolution, and TF-imposed constraint appears to be a major driver of codon usage bias. Conversely, the regulatory code has been selectively depleted of TFs that recognize stop codons. More than 17% of single-nucleotide variants within duons directly alter TF binding. Pervasive dual encoding of amino acid and regulatory information appears to be a fundamental feature of genome evolution.
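As an illustration of the dual-use idea in the abstract above, the following is a minimal sketch of how one might count codons that overlap TF footprints. It is not the authors' pipeline: the function names, the half-open coordinate convention, and the simplification to a single contiguous forward-strand coding segment are all assumptions made for this example.

```python
def codon_intervals(cds_start, cds_end):
    """Yield half-open (start, end) intervals for each complete codon.

    Assumes one contiguous, forward-strand coding segment (hypothetical
    simplification; real transcripts are spliced and may be reverse-strand).
    """
    for start in range(cds_start, cds_end - 2, 3):
        yield (start, start + 3)


def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def duon_fraction(cds_start, cds_end, footprints):
    """Fraction of codons overlapping at least one footprint interval."""
    codons = list(codon_intervals(cds_start, cds_end))
    if not codons:
        return 0.0
    duons = sum(1 for c in codons if any(overlaps(c, f) for f in footprints))
    return duons / len(codons)


# Toy example: a 30-bp coding segment (10 codons) and one footprint
# covering bases 4..7, which touches the codons at 3..5 and 6..8.
fraction = duon_fraction(0, 30, [(4, 8)])
```

With real data the footprint list would come from DNase I footprint calls, and the quadratic scan would likely be replaced by an interval index.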
- Source available from: Michael Hackenberg
- "(Stergachis et al., 2013; Weatheritt and Babu, 2013 "
Article: DNA clustering and genome complexity
ABSTRACT: Early global measures of genome complexity (power spectra, the analysis of fluctuations in DNA walks, or compositional segmentation) uncovered a high degree of complexity in eukaryotic genome sequences. The main evolutionary mechanisms leading to increases in genome complexity (i.e., gene duplication and transposon proliferation) can all potentially produce increases in DNA clustering. To quantify such clustering and provide a genome-wide description of the resulting clusters, we developed GenomeCluster, an algorithm able to detect clusters of any genome element identified by chromosome coordinates. We obtained a detailed description of clusters for ten categories of human genome elements, including functional (genes, exons, introns), regulatory (CpG islands, TFBSs, enhancers), variant (SNPs), and repeat (Alus, LINE1) elements, as well as DNase hypersensitivity sites. For each category, we located the clusters in the human genome, quantified cluster length and composition, and estimated the clustering level as the proportion of clustered genome elements. On average, 27% of elements fall in clusters, although the proportion varies considerably among categories. Genes form the fewest clusters, but these are the longest, both in bp and in average number of components, while the shortest clusters are formed by SNPs. Functional and regulatory elements (genes, CpG islands, TFBSs, enhancers) show the highest clustering level, compared with DNase sites, repeats (Alus, LINE1), or SNPs. Many of the genome elements we analyzed are known to be composed of clusters of low-level entities. In addition, we found that the clusters generated by GenomeCluster can in turn be clustered into high-level super-clusters. The observation of ‘clusters-within-clusters’ parallels the ‘domains within domains’ phenomenon previously detected through global statistical methods in eukaryotic sequences, and reveals a complex human genome landscape dominated by hierarchical clustering.
Computational Biology and Chemistry 12/2014; 53. DOI:10.1016/j.compbiolchem.2014.08.011 · 1.60 Impact Factor
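The abstract above does not state the criterion GenomeCluster uses to call a cluster, so the sketch below substitutes a simple fixed-distance rule purely for illustration: elements on one chromosome are chained together while the gap to the running cluster end stays within `max_gap`. The function names, `max_gap`, and `min_size` are hypothetical; only the definition of clustering level (proportion of elements that fall in clusters) is taken from the abstract.

```python
def find_clusters(intervals, max_gap, min_size=2):
    """Group (start, end) intervals on one chromosome into clusters.

    Assumed rule (not GenomeCluster's documented method): an element joins
    the current cluster if it starts within max_gap of the cluster's end.
    Groups with fewer than min_size elements are not reported as clusters.
    """
    clusters, current, current_end = [], [], None
    for start, end in sorted(intervals):
        if current and start - current_end > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append((start, end))
        current_end = end if len(current) == 1 else max(current_end, end)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


def clustering_level(intervals, clusters):
    """Proportion of elements that belong to some cluster."""
    if not intervals:
        return 0.0
    return sum(len(c) for c in clusters) / len(intervals)


# Toy example: six elements, two of which chain into one cluster and
# three into another; the isolated element at 100..110 is left out.
elements = [(0, 10), (15, 20), (100, 110), (500, 510), (505, 520), (530, 540)]
clusters = find_clusters(elements, max_gap=20)
level = clustering_level(elements, clusters)
```

Applying `find_clusters` again to the span of each reported cluster would give a rough analogue of the super-clusters mentioned in the abstract, again under the assumed distance rule.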
- "To date, the role of gene body methylation remains unclear, although intriguing correlations have been identified related to differential promoter use and alternative splicing (Maunakea et al., 2010; Shukla et al., 2011). The recent discovery of dual-use codons (duons) for transcription factor binding generates further possible regulatory roles for cytosine modifications in gene bodies, as it could impact transcription factor binding (Stergachis et al., 2013). In this regard, our BGS and TAB-BGS results from the HOXA9 locus showing elevated 5hmC focused specifically at an exon-intron junction upon depletion of DNMT3B are intriguing. "
ABSTRACT: Global patterns of DNA methylation, mediated by the DNA methyltransferases (DNMTs), are disrupted in all cancers by mechanisms that remain largely unknown, hampering their development as therapeutic targets. Combinatorial acute depletion of all DNMTs in a pluripotent human tumor cell line, followed by epigenome and transcriptome analysis, revealed DNMT functions in fine detail. DNMT3B occupancy regulates methylation during differentiation, whereas an unexpected interplay was discovered in which DNMT1 and DNMT3B antithetically regulate methylation and hydroxymethylation in gene bodies, a finding confirmed in other cell types. DNMT3B mediated non-CpG methylation, whereas DNMT3L influenced the activity of DNMT3B toward non-CpG versus CpG site methylation. Altogether, these data reveal functional targets of each DNMT, suggesting that isoform selective inhibition would be therapeutically advantageous. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
- "This phenomenon has been termed codon usage bias (CUB), and many studies support a role of natural selection in this phenomenon (Shields et al. 1988; Moriyama and Hartl 1993; Akashi et al. 1998; Comeron and Kreitman 1998; Chamary et al. 2006; Plotkin and Kudla 2011; Waldman et al. 2011; Behura et al. 2013; Kober and Pogson 2013). Proposed mechanisms influencing CUB include translational efficiency (Grantham et al. 1981; Ikemura 1985; Bulmer 1991; Carlini and Stephan 2003; Rocha 2004; Stoletzki and Eyre-Walker 2007; Parmley and Huynen 2009; Hense 2010; Ran and Higgs 2010, 2012; Sharp et al. 2010; Behura and Severson 2011; Shah and Gilchrist 2011; Qian et al. 2012; Agashe et al. 2013; Lawrie et al. 2013; Michely 2013), mRNA stability or folding (Moriyama and Powell 1998; dos Reis et al. 2004; Chamary and Hurst 2005; Chamary et al. 2006; Novoa and Ribas de Pouplana 2012; Kober and Pogson 2013; Shabalina et al. 2013), transcription factor binding (Stergachis 2013), overlap with other functional elements in the genome (Lin 2011), and/or a trade-off between rapid versus accurate translation (Yang et al. 2014). The level of CUB varies dramatically across species (Grantham et al. 1980a,b; Sharp 1988), including insects (Vicario et al. 2007), mammals (Doherty and McInerney 2013), and plants (Ingvarsson 2008, 2010). "
ABSTRACT: Synonymous codons are not used at equal frequency throughout the genome, a phenomenon termed codon usage bias (CUB). It is often assumed that interspecific variation in the intensity of CUB is related to species differences in effective population sizes (Ne), with selection on CUB operating less efficiently in species with small Ne. Here, we specifically ask whether variation in Ne predicts differences in CUB in mammals and report two main findings. First, across 41 mammalian genomes, CUB was not correlated with two indirect proxies of Ne (body mass and generation time), even though there was statistically significant evidence of selection shaping CUB across all species. Interestingly, autosomal genes showed higher codon usage bias compared to X-linked genes, and high-recombination genes showed higher codon usage bias compared to low-recombination genes, suggesting intraspecific variation in Ne predicts variation in CUB. Second, across six mammalian species with genetic estimates of Ne (human, chimpanzee, rabbit, and three mouse species: Mus musculus, M. domesticus, and M. castaneus), Ne and CUB were weakly and inconsistently correlated. At least in mammals, interspecific divergence in Ne does not strongly predict variation in CUB. One hypothesis is that each species responds to a unique distribution of selection coefficients, confounding any straightforward link between Ne and CUB.
Ecology and Evolution 10/2014; 4(20). DOI:10.1002/ece3.1249 · 2.32 Impact Factor
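To make "synonymous codons are not used at equal frequency" concrete, here is a minimal sketch that computes relative synonymous codon usage (RSCU) for a coding sequence. RSCU is one common CUB measure; the abstract above does not say which measure the authors used, so this choice is an assumption, as are the function name and the use of the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def rscu(cds):
    """Return {codon: RSCU} for amino acids observed in the sequence.

    RSCU = observed count * (number of synonymous codons) / (total count
    for that amino acid). A value of 1.0 means no bias; stop codons and
    amino acids absent from the sequence are skipped.
    """
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


# Toy example: four leucine codons, three CTG and one CTA.
values = rscu("CTGCTGCTGCTA")
```

In the toy sequence leucine has six synonymous codons, so CTG scores 3 × 6 / 4 = 4.5 and CTA scores 1.5, while the four unused leucine codons score 0.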