Assessing the Conservation of Mammalian Gene Expression Using High-Density Exon Arrays
Microarray data from multiple species have been used to study evolutionary constraints on gene expression. Expression measurements from conventional microarray platforms such as the 3' expression arrays are strongly affected by platform-dependent probe effects that may introduce apparent but misleading discrepancies between species. In this manuscript, we assess the conservation of mammalian gene expression in adult tissues using data from a high-density exon array platform. The exon arrays have more than 6 million probes on a single array targeting all exons in a genome. We find that, unlike 3' array data, gene expression measurements from exon arrays reveal patterns of gene expression that are highly conserved between humans and mice in multiple tissues. Our analysis provides strong evidence for widespread stabilizing selection pressure on transcript abundance during mammalian evolution.
Available from: Carole Leavel Bassett
- "It has been reported that duplicated genes rarely diverge with respect to their biochemical function, but instead are limited to alterations in regulatory control. So the different expression profiles between duplicated genes may be caused by varied regulatory network or mutations in the cis-regulatory regions, or mutations affecting the related regulatory network [60,61]. Epigenetic mechanisms, such as DNA methylation have also been suggested to potentially contribute to the expression divergence of duplicated genes [62,63], where transcriptional silencing has often been associated with DNA methylation in promoter regions [64,65]."
ABSTRACT: Aspartic proteases (APs) are a large family of proteolytic enzymes found in almost all organisms. In plants, they are involved in many biological processes, such as senescence, stress responses, programmed cell death, and reproduction. Prior to the present study, no grape AP genes had been reported, and research on APs in woody species was very limited.
In this study, a total of 50 AP genes (VvAP) were identified in the grape genome, among which 30 contained the complete ASP domain. Synteny analysis within grape indicated that segmental and tandem duplication events contributed to the expansion of the grape AP family. Additional analysis between grape and Arabidopsis demonstrated that several grape AP genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of grape and Arabidopsis. Phylogenetic relationships of the 30 VvAPs with the complete ASP domain and their Arabidopsis orthologs, as well as their gene and protein features, were analyzed and their cellular localization was predicted. Moreover, expression profiles of VvAP genes in six different tissues were determined, and their transcript abundance under various stresses and hormone treatments was measured. Twenty-seven VvAP genes were expressed in at least one of the six tissues examined; 19 VvAPs responded to at least one abiotic stress, 12 VvAPs responded to powdery mildew infection, and most of the VvAPs responded to SA and ABA treatments. Furthermore, integrated synteny and phylogenetic analysis identified orthologous AP genes between grape and Arabidopsis, providing a unique starting point for investigating the function of grape AP genes.
The genome-wide identification, evolutionary and expression analyses of grape AP genes provide a framework for future analysis of AP genes in defining their roles during stress response. Integrated synteny and phylogenetic analyses provide novel insight into the functions of less well-studied genes using information from their better understood orthologs.
BMC Genomics 08/2013; 14(1):554. DOI:10.1186/1471-2164-14-554 · 3.99 Impact Factor
Available from: Sven Bergmann
- "The two most common measures of similarity between expression profiles of orthologous genes are Pearson's correlation coefficient (Chan et al., 2009; Liao and Zhang, 2006a, b; Xing et al., 2007; Yanai et al., 2004; Yang et al., 2005; Zheng-Bradley et al., 2010) and Euclidean distance (Jordan et al., 2005; Liao and Zhang, 2006a; Yanai et al., 2004). The results obtained with Pearson's and Euclidean distances have been reported to be poorly correlated (Liao and Zhang, 2006a; Pereira et al., 2009). "
ABSTRACT: Motivation: Comparative analyses of gene expression data from different species have become an important component of the study of molecular evolution. Thus methods are needed to estimate evolutionary distances between expression profiles, as well as a neutral reference to estimate selective pressure. Divergence between expression profiles of homologous genes is often calculated with Pearson's or Euclidean distance. Neutral divergence is usually inferred from randomized data. Despite being widely used, neither of these two steps has been well studied. Here, we analyze these methods formally and on real data, highlight their limitations and propose improvements.
Results: It has been demonstrated that Pearson's distance, in contrast to Euclidean distance, leads to underestimation of the expression similarity between homologous genes with a conserved uniform pattern of expression. Here, we first extend this study to genes with a conserved but specific pattern of expression. Surprisingly, we find that both Pearson's and Euclidean distances, when used as a measure of expression similarity between genes, depend on the expression specificity of those genes. We also show that the Euclidean distance depends strongly on data normalization. Next, we show that the randomization procedure that is widely used to estimate the rate of neutral evolution is biased when broadly expressed genes are abundant in the data. To overcome this problem, we propose a novel randomization procedure that is unbiased with respect to expression profiles present in the datasets. Applying our method to the mouse and human gene expression data suggests significant gene expression conservation between these species.
Supplementary data are available at Bioinformatics online.
Bioinformatics 05/2012; 28(14):1865-72. DOI:10.1093/bioinformatics/bts266 · 4.98 Impact Factor
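The two distance measures compared in the abstract above can be illustrated with a minimal sketch. It assumes that "Pearson's distance" means 1 minus Pearson's correlation coefficient (a common convention, not confirmed by the abstract), and the tissue profiles are made-up values chosen only to show the effect on a near-uniform versus a tissue-specific pattern. The function names are hypothetical.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r for two equal-length expression profiles (assumed definition)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # correlation is undefined for a perfectly flat profile
    return 1.0 - cov / (sd_x * sd_y)


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Made-up log-expression values across four tissues (illustrative only).
uniform_human = [10.0, 10.1, 9.9, 10.0]
uniform_mouse = [10.0, 9.9, 10.1, 10.0]
specific_human = [1.0, 1.0, 8.0, 1.0]
specific_mouse = [1.0, 1.0, 9.0, 1.0]

for label, h, m in [
    ("uniform", uniform_human, uniform_mouse),
    ("specific", specific_human, specific_mouse),
]:
    print(label, round(pearson_distance(h, m), 3), round(euclidean_distance(h, m), 3))
```

In this toy case the near-uniform pair is almost identical in absolute terms, yet small opposite fluctuations push the Pearson distance to its maximum, while the Euclidean distance stays small. The tissue-specific pair shows the reverse ordering, which is one way to see why the two measures can disagree.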
Available from: Carina Farah Mugal
- "Three repeated measurements of gene expression values, denoted as expression indices, were available. Following Xing et al., we took the logarithm of the expression indices to gain a measure approximately linearly proportional to transcription levels in germ cells. Subsequently, mean values were computed for each set of repeated measurements."
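The preprocessing described in this excerpt (log-transform each expression index, then average over the repeated measurements) might look like the sketch below. The base of the logarithm is not stated in the excerpt, so base 2 is an assumption here, and the gene names and values are hypothetical.

```python
import math


def mean_log_expression(expression_indices):
    """Log-transform each repeated measurement, then return the mean of the logs.

    Base 2 is an assumption; the excerpt does not state the base used.
    """
    if not expression_indices:
        raise ValueError("at least one measurement is required")
    if any(value <= 0 for value in expression_indices):
        raise ValueError("expression indices must be positive to take a logarithm")
    logs = [math.log2(value) for value in expression_indices]
    return sum(logs) / len(logs)


# Hypothetical genes, each with three repeated expression indices.
measurements = {
    "geneA": [4.0, 8.0, 16.0],
    "geneB": [2.0, 2.0, 2.0],
}

summary = {gene: mean_log_expression(values) for gene, values in measurements.items()}
print(summary)
```

Taking the logarithm before averaging, as the excerpt describes, gives the log of the geometric mean rather than the log of the arithmetic mean, so a single high replicate has less influence on the summary value.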
ABSTRACT: A major goal in the study of molecular evolution is to unravel the mechanisms that induce variation in the germ line mutation rate and in the genome-wide mutation profile. The rate of germ line mutation is considerably higher for cytosines at CpG sites than for any other nucleotide in the human genome, an increase commonly attributed to cytosine methylation at CpG sites. The CpG mutation rate, however, is not uniform across the genome and, as methylation levels have recently been shown to vary throughout the genome, it has been hypothesized that methylation status may govern variation in the rate of CpG mutation.
Here, we use genome-wide methylation data from human sperm cells to investigate the impact of DNA methylation on the CpG substitution rate in introns of human genes. We find that there is a significant correlation between the extent of methylation and the substitution rate at CpG sites. Further, we show that the CpG substitution rate is positively correlated with non-CpG divergence, suggesting susceptibility to factors responsible for the general mutation rate in the genome, and negatively correlated with GC content. We only observe a minor contribution of gene expression level, while recombination rate appears to have no significant effect.
Our study provides the first direct empirical support for the hypothesis that variation in the level of germ line methylation contributes to substitution rate variation at CpG sites. Moreover, we show that other genomic features also impact on CpG substitution rate variation.
Genome biology 06/2011; 12(6):R58. DOI:10.1186/gb-2011-12-6-r58 · 10.81 Impact Factor