-
Hunter B Fraser
ABSTRACT: The role of gene expression in evolutionary adaptation has been a subject of debate for over 40 years. cis-regulation of transcription has been proposed to be the primary source of morphological novelty in evolution, though this is based on only a handful of examples. Recently the first genome-wide studies of gene expression adaptation have been published, giving us an initial global view of this process. Systematic studies such as these will allow a number of key questions currently facing the field of gene expression evolution to be addressed.
BioEssays 06/2011; 33(6):469-77. · 4.95 Impact Factor
-
ABSTRACT: The idea that most morphological adaptations can be attributed to changes in the cis-regulation of gene expression levels has been gaining increasing acceptance, despite the fact that only a handful of such cases have so far been demonstrated. Moreover, because each of these cases involves only one gene, we lack any understanding of how natural selection may act on cis-regulation across entire pathways or networks. Here we apply a genome-wide test for selection on cis-regulation to two subspecies of the mouse Mus musculus. We find evidence for lineage-specific selection at over 100 genes involved in diverse processes such as growth, locomotion, and memory. These gene sets implicate candidate genes that are supported by both quantitative trait loci and a validated causality-testing framework, and they predict a number of phenotypic differences, which we confirm in all four cases tested. Our results suggest that gene expression adaptation is widespread and that these adaptations can be highly polygenic, involving cis-regulatory changes at numerous functionally related genes. These coordinated adaptations may contribute to divergence in a wide range of morphological, physiological, and behavioral phenotypes.
PLoS Genetics 03/2011; 7(3):e1002023. · 8.69 Impact Factor
-
John R Lamb,
Chunsheng Zhang,
Tao Xie,
Kai Wang,
Bin Zhang,
Ke Hao,
Eugene Chudin,
Hunter B Fraser,
Joshua Millstein,
Mark Ferguson, [......],
Cliona Molony,
Valur Emilsson,
Carolyn Buser-Doepner,
Jun Zhu,
Stephen Friend,
Mao Mao,
Peter M Shaw,
Hongyue Dai,
John M Luk,
Eric E Schadt
ABSTRACT: In hepatocellular carcinoma (HCC), genes predictive of survival have been found in both adjacent normal (AN) and tumor (TU) tissues. The relationships between these two sets of predictive genes and the general process of tumorigenesis and disease progression remain unclear.
Here we have investigated HCC tumorigenesis by comparing gene expression, DNA copy number variation and survival using ∼250 AN and TU samples representing, respectively, the pre-cancer state, and the result of tumorigenesis. Genes that participate in tumorigenesis were defined using a gene-gene correlation meta-analysis procedure that compared AN versus TU tissues. Genes predictive of survival in AN (AN-survival genes) were found to be enriched in the differential gene-gene correlation gene set indicating that they directly participate in the process of tumorigenesis. Additionally the AN-survival genes were mostly not predictive after tumorigenesis in TU tissue and this transition was associated with and could largely be explained by the effect of somatic DNA copy number variation (sCNV) in cis and in trans. The data was consistent with the variance of AN-survival genes being rate-limiting steps in tumorigenesis and this was confirmed using a treatment that promotes HCC tumorigenesis that selectively altered AN-survival genes and genes differentially correlated between AN and TU.
This suggests that the process of tumor evolution involves rate-limiting steps related to the background from which the tumor evolved where these were frequently predictive of clinical outcome. Additionally treatments that alter the likelihood of tumorigenesis occurring may act by altering AN-survival genes, suggesting that the process can be manipulated. Further sCNV explains a substantial fraction of tumor specific expression and may therefore be a causal driver of tumor evolution in HCC and perhaps many solid tumor types.
PLoS ONE 01/2011; 6(7):e20090. · 4.09 Impact Factor
-
Proceedings of the National Academy of Sciences 06/2010; · 9.68 Impact Factor
-
ABSTRACT: Changes in gene expression have been proposed to underlie many, or even most, adaptive differences between species. Despite the increasing acceptance of this view, only a handful of cases of adaptive gene expression evolution have been demonstrated. To address this discrepancy, we introduce a simple test for lineage-specific selection on gene expression. Applying the test to genome-wide gene expression data from the budding yeast Saccharomyces cerevisiae, we find that hundreds of gene expression levels have been subject to lineage-specific selection. Comparing these findings with independent population genetic evidence of selective sweeps suggests that this lineage-specific selection has resulted in recent sweeps at over a hundred genes, most of which led to increased transcript levels. Examination of the implicated genes revealed a specific biochemical pathway--ergosterol biosynthesis--where the expression of multiple genes has been subject to selection for reduced levels. In sum, these results suggest that adaptive evolution of gene expression is common in yeast, that regulatory adaptation can occur at the level of entire pathways, and that similar genome-wide scans may be possible in other species, including humans.
Proceedings of the National Academy of Sciences 02/2010; 107(7):2977-82. · 9.68 Impact Factor
-
ABSTRACT: Phenotypic robustness, or canalization, has been extensively investigated both experimentally and theoretically. However, it remains unknown to what extent robustness varies between individuals, and whether factors buffering environmental variation also buffer genetic variation. Here we introduce a quantitative genetic approach to these issues, and apply this approach to data from three species. In mice, we find suggestive evidence that for hundreds of gene expression traits, robustness is polymorphic and can be genetically mapped to discrete genomic loci. Moreover, we find that the polymorphisms buffering genetic variation are distinct from those buffering environmental variation. In fact, these two classes have quite distinct mechanistic bases: environmental buffers of gene expression are predominantly sex-specific and trans-acting, whereas genetic buffers are not sex-specific and often cis-acting. Data from studies of morphological and life-history traits in plants and yeast support the distinction between polymorphisms buffering genetic and environmental variation, and further suggest that loci buffering different types of environmental variation do overlap with one another. These preliminary results suggest that naturally occurring polymorphisms affecting phenotypic robustness could be abundant, and that these polymorphisms may generally buffer either genetic or environmental variation, but not both.
PLoS ONE 01/2010; 5(1):e8635. · 4.09 Impact Factor
-
Tomas Babak,
Philip Garrett-Engele,
Christopher D Armour,
Christopher K Raymond,
Mark P Keller,
Ronghua Chen,
Carol A Rohl,
Jason M Johnson,
Alan D Attie,
Hunter B Fraser,
Eric E Schadt
ABSTRACT: Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application.
Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants.
Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing.
BMC Genomics 01/2010; 11:473. · 4.07 Impact Factor
-
ABSTRACT: Although protein sequences are known to evolve at vastly different rates, little is known about what determines their rate of evolution. However, a recent study using principal component regression (PCR) has concluded that evolutionary rates in yeast are primarily governed by a single determinant related to translation frequency. Here, we demonstrate that noise in biological data can confound PCRs, leading to spurious conclusions. When equalizing noise levels across 7 predictor variables used in previous studies, we find no evidence that protein evolution is dominated by a single determinant. Our results indicate that a variety of factors--including expression level, gene dispensability, and protein-protein interactions--may independently affect evolutionary rates in yeast. More accurate measurements or more sophisticated statistical techniques will be required to determine which one, if any, of these factors dominates protein evolution.
Molecular Biology and Evolution 06/2007; 24(5):1113-21. · 5.55 Impact Factor
-
ABSTRACT: Predicting the phenotypic effects of mutations is a central goal of genetics research; it has important applications in elucidating how genotype determines phenotype and in identifying human disease genes.
Using a wide range of functional genomic data from the yeast Saccharomyces cerevisiae, we show that the best predictor of a protein's knockout phenotype is the knockout phenotype of other proteins that are present in a protein complex with it. Even the addition of multiple datasets does not improve upon the predictions made from protein complex membership. Similarly, we find that a proxy for protein complexes is a powerful predictor of disease phenotypes in humans.
We propose that identifying human protein complexes containing known disease genes will be an efficient method for large-scale disease gene discovery, and that yeast may prove to be an informative model system for investigating, and even predicting, the genetic basis of both Mendelian and complex disease phenotypes.
Genome biology 02/2007; 8(11):R252. · 6.63 Impact Factor
-
Hunter B Fraser
ABSTRACT: The concepts of coevolution and modularity have been studied separately for decades. Recent advances in genomics have led to the first systematic studies in each of these fields at the molecular level, resulting in several important discoveries. Both coevolution and modularity appear to be pervasive features of genomic data from all species studied to date, and their presence can be detected in many types of datasets, including genome sequences, gene expression data, and protein-protein interaction data. Moreover, the combination of these two ideas might have implications for our understanding of many aspects of biology, ranging from the general architecture of living systems to the causes of various human diseases.
Current Opinion in Genetics & Development 01/2007; 16(6):637-44. · 8.09 Impact Factor
-
ABSTRACT: Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest--requiring approximately 100 selected sites--but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.
Journal of Molecular Evolution 12/2006; 63(5):635-53. · 2.27 Impact Factor
-
ABSTRACT: We recently introduced a novel method for estimating selection pressures on proteins, termed "volatility," which requires only a single genome sequence. Some criticisms that have been levied against this approach are valid, but many others are based on misconceptions of volatility, or they apply equally to comparative methods of estimating selection. Here, we introduce a simple regression technique for estimating selection pressures on all proteins in a genome, on the basis of limited comparative data. The regression technique does not depend on an underlying population-genetic mechanism. This new approach to estimating selection across a genome should be more powerful and more widely applicable than volatility itself.
Molecular Biology and Evolution 09/2006; 23(8):1457-9. · 5.55 Impact Factor
-
ABSTRACT: It is well established that gene expression levels in many organisms change during the aging process, and the advent of DNA microarrays has allowed genome-wide patterns of transcriptional changes associated with aging to be studied in both model organisms and various human tissues. Understanding the effects of aging on gene expression in the human brain is of particular interest, because of its relation to both normal and pathological neurodegeneration. Here we show that human cerebral cortex, human cerebellum, and chimpanzee cortex each undergo different patterns of age-related gene expression alterations. In humans, many more genes undergo consistent expression changes in the cortex than in the cerebellum; in chimpanzees, many genes change expression with age in cortex, but the pattern of changes in expression bears almost no resemblance to that of human cortex. These results demonstrate the diversity of aging patterns present within the human brain, as well as how rapidly genome-wide patterns of aging can evolve between species; they may also have implications for the oxidative free radical theory of aging, and help to improve our understanding of human neurodegenerative diseases.
PLoS Biology 10/2005; 3(9):e274. · 11.45 Impact Factor
-
Hunter B Fraser
ABSTRACT: Modularity, which has been found in the functional and physical protein interaction networks of many organisms, has been postulated to affect both the mode and tempo of evolution. Here I show that in the yeast Saccharomyces cerevisiae, protein interaction hubs situated in single modules are highly constrained, whereas those connecting different modules are more plastic. This pattern of change could reflect a tendency for evolutionary innovations to occur by altering the proteins and interactions between rather than within modules, in a manner somewhat similar to the evolution of new proteins through the shuffling of conserved protein domains.
Nature Genetics 05/2005; 37(4):351-2. · 35.53 Impact Factor
-
ABSTRACT: The evolutionary rates of proteins vary over several orders of magnitude. Recent work suggests that analysis of large data sets of evolutionary rates in conjunction with the results from high-throughput functional genomic experiments can identify the factors that cause proteins to evolve at such dramatically different rates. To this end, we estimated the evolutionary rates of >3,000 proteins in four species of the yeast genus Saccharomyces and investigated their relationship with levels of expression and protein dispensability. Each protein's dispensability was estimated by the growth rate of mutants deficient for the protein. Our analyses of these improved evolutionary and functional genomic data sets yield three main results. First, dispensability and expression have independent, significant effects on the rate of protein evolution. Second, measurements of expression levels in the laboratory can be used to filter data sets of dispensability estimates, removing variates that are unlikely to reflect real biological effects. Third, structural equation models show that although we may reasonably infer that dispensability and expression have significant effects on protein evolutionary rate, we cannot yet accurately estimate the relative strengths of these effects.
Proceedings of the National Academy of Sciences 04/2005; 102(15):5483-8. · 9.68 Impact Factor
-
ABSTRACT: Evolution at silent sites is often used to estimate the pace of selectively neutral processes or to infer differences in divergence times of genes. However, silent sites are subject to selection in favor of preferred codons, and the strength of such selection varies dramatically across genes. Here, we use the relationship between codon bias and synonymous divergence observed in four species of the genus Saccharomyces to provide a simple correction for selection on silent sites.
Molecular Biology and Evolution 02/2005; 22(1):174-7. · 5.55 Impact Factor
-
ABSTRACT: Relatively little is known about the mechanisms through which gene expression regulation evolves. To investigate this, we systematically explored the conservation of regulatory networks in fungi by examining the cis-regulatory elements that govern the expression of coregulated genes. We first identified groups of coregulated Saccharomyces cerevisiae genes enriched for genes with known upstream or downstream cis-regulatory sequences. Reasoning that many of these gene groups are coregulated in related species as well, we performed similar analyses on orthologs of coregulated S. cerevisiae genes in 13 other ascomycete species. We find that many species-specific gene groups are enriched for the same flanking regulatory sequences as those found in the orthologous gene groups from S. cerevisiae, indicating that those regulatory systems have been conserved in multiple ascomycete species. In addition to these clear cases of regulatory conservation, we find examples of cis-element evolution that suggest multiple modes of regulatory diversification, including alterations in transcription factor-binding specificity, incorporation of new gene targets into an existing regulatory system, and cooption of regulatory systems to control a different set of genes. We investigated one example in greater detail by measuring the in vitro activity of the S. cerevisiae transcription factor Rpn4p and its orthologs from Candida albicans and Neurospora crassa. Our results suggest that the DNA binding specificity of these proteins has coevolved with the sequences found upstream of the Rpn4p target genes and suggest that Rpn4p has a different function in N. crassa.
PLoS Biology 01/2005; 2(12):e398. · 11.45 Impact Factor
-
ABSTRACT: Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed `volatility', to estimate selection pressures on protein sequences from their synonymous codon usage (Plotkin and Dushoff 2003, Plotkin et al 2004a). Here we provide a theoretical foundation for this approach. We derive the expected frequencies of synonymous codons as a function of the strength of selection, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. Our results indicate that, over a broad range of parameters, synonymous codon usage can reliably distinguish between negative selection, positive selection, and neutrality. While the power of volatility to detect negative selection depends on the population size, there is no such dependence for the detection of positive selection. Furthermore, we show that phenomena such as transient hyper-mutators in microbes can improve the power of volatility to detect negative selection, even when the typical observed neutral site heterozygosity is low.
11/2004;
-
ABSTRACT: Physically interacting proteins or parts of proteins are expected to evolve in a coordinated manner that preserves proper interactions. Such coevolution at the amino acid-sequence level is well documented and has been used to predict interacting proteins, domains, and amino acids. Interacting proteins are also often precisely coexpressed with one another, presumably to maintain proper stoichiometry among interacting components. Here, we show that the expression levels of physically interacting proteins coevolve. We estimate average expression levels of genes from four closely related fungi of the genus Saccharomyces using the codon adaptation index and show that expression levels of interacting proteins exhibit coordinated changes in these different species. We find that this coevolution of expression is a more powerful predictor of physical interaction than is coevolution of amino acid sequence. These results demonstrate that gene expression levels can coevolve, adding another dimension to the study of the coevolution of interacting proteins and underscoring the importance of maintaining coexpression of interacting proteins over evolutionary time. Our results also suggest that expression coevolution can be used for computational prediction of protein-protein interactions.
Proceedings of the National Academy of Sciences 07/2004; 101(24):9033-8. · 9.68 Impact Factor
-
ABSTRACT: All organisms have elaborate mechanisms to control rates of protein production. However, protein production is also subject to stochastic fluctuations, or "noise." Several recent studies in Saccharomyces cerevisiae and Escherichia coli have investigated the relationship between transcription and translation rates and stochastic fluctuations in protein levels, or more generally, how such randomness is a function of intrinsic and extrinsic factors. However, the fundamental question of whether stochasticity in protein expression is generally biologically relevant has not been addressed, and it remains unknown whether random noise in the protein production rate of most genes significantly affects the fitness of any organism. We propose that organisms should be particularly sensitive to variation in the protein levels of two classes of genes: genes whose deletion is lethal to the organism and genes that encode subunits of multiprotein complexes. Using an experimentally verified model of stochastic gene expression in S. cerevisiae, we estimate the noise in protein production for nearly every yeast gene, and confirm our prediction that the production of essential and complex-forming proteins involves lower levels of noise than does the production of most other genes. Our results support the hypothesis that noise in gene expression is a biologically important variable, is generally detrimental to organismal fitness, and is subject to natural selection.
PLoS Biology 07/2004; 2(6):e137. · 11.45 Impact Factor