Single nucleotide polymorphisms affect both cis- and trans-eQTLs

Department of Biostatistics, Section on Statistical Genetics, School of Public Health, University of Alabama at Birmingham, AL 35209, USA.
Genomics (Impact Factor: 2.28). 03/2009; 93(6):501-8. DOI: 10.1016/j.ygeno.2009.01.011
Source: PubMed


Single nucleotide polymorphisms (SNPs) between microarray probes and RNA targets can degrade the performance of expression arrays by weakening hybridization. In this paper, we examined the effect of SNPs on Affymetrix GeneChip probe set summaries and on expression quantitative trait loci (eQTL) mapping results in two eQTL datasets, one from mouse and one from human. We showed that removing SNP-containing probes significantly changed the probe set summaries, and that the more SNP-containing probes we removed, the greater the change. Comparing the eQTL mapping results obtained with and without SNP-containing probes showed that less than 70% of the significant eQTL peaks were concordant, regardless of the significance threshold. These results indicate that SNPs do affect both probe set summaries and eQTLs (both cis and trans); SNP-containing probes should therefore be filtered out to improve the performance of eQTL mapping.
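
The filtering and comparison steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the probe record layout, the per-sample median used as a probe set summary, and the overlap-based concordance measure are all assumptions made for the example, and the function names are hypothetical.

```python
from statistics import median

def filter_snp_probes(probes, snp_positions):
    """Drop probes whose [start, end) interval contains any SNP position."""
    return [
        p for p in probes
        if not any(p["start"] <= s < p["end"] for s in snp_positions)
    ]

def summarize(probes):
    """Per-sample median across probes (an assumed stand-in for a real
    probe set summary method)."""
    if not probes:
        return []
    n_samples = len(probes[0]["intensity"])
    return [median(p["intensity"][i] for p in probes) for i in range(n_samples)]

def concordance(peaks_a, peaks_b):
    """Assumed measure: share of peaks in the union found in both sets."""
    union = peaks_a | peaks_b
    return len(peaks_a & peaks_b) / len(union) if union else 1.0

# Toy probe set: three 25-mer probes, three samples (values are made up).
probes = [
    {"start": 100, "end": 125, "intensity": [8.0, 8.2, 5.1]},
    {"start": 130, "end": 155, "intensity": [7.9, 8.1, 7.8]},
    {"start": 160, "end": 185, "intensity": [8.1, 8.3, 8.0]},
]
snps = [110]  # hypothetical SNP position falling inside the first probe

kept = filter_snp_probes(probes, snps)
summary_all = summarize(probes)
summary_filtered = summarize(kept)

# Toy eQTL peaks as (gene, locus) pairs, with and without SNP-containing probes.
peaks_with = {("g1", "chr1"), ("g2", "chr3")}
peaks_without = {("g1", "chr1"), ("g3", "chr5")}
shared = concordance(peaks_with, peaks_without)
```

Comparing `summary_all` with `summary_filtered` shows how a single SNP-affected probe can shift a summary for some samples, and `shared` gives one possible way to quantify agreement between the two sets of mapping results.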


Available from: Rui Feng, Mar 27, 2014
  • Source
    • "We downloaded the whole-adult Affymetrix Drosophila 2.0 array expression data (accession number E-MEXP-1594) reported in Ayroles et al. (2009) and the original 37 sequences of the DGRP first released in 2009 (Mackay et al. 2012). Probes with underlying SNPs were removed or masked (Benovoy et al. 2008; Chen et al. 2009). The sex effect for each gene was removed and the residuals rescaled to standardized deviates using the total sample variance. "
    ABSTRACT: In this paper we couple the geographic variation in 127 SNP frequencies in genes of 46 enzymes of central metabolism with their associated cis-expression variation to predict latitudinal or climatic driven gene expression changes in the metabolic architecture of Drosophila melanogaster. Forty-two percent of the SNPs in 65% of the genes show statistically significant clines in frequency with latitude across the 20 local population samples collected from southern Florida to Ontario. A number of SNPs in the screened genes are also associated with significant expression variation within the Raleigh population from North Carolina. A principal component analysis of the full variance-covariance matrix of latitudinal changes in SNP-associated standardized gene expression allows us to identify those major genes in the pathway and its associated branches that are likely targets of natural selection. When embedded in a central metabolic context, we show that these apparent targets are concentrated in the genes of the upper glycolytic pathway and pentose shunt, those controlling glycerol shuttle activity and finally those enzymes associated with the utilization of glutamate and pyruvate. These metabolites possess high connectivity and thus may be the points where flux balance can be best shifted. We also propose that these points are conserved points associated with coupling energy homeostasis and energy-sensing in mammals. We speculate that the modulation of gene expression at specific points in central metabolism that are associated with shifting flux balance or possibly energy-state sensing plays a role in adaptation to climatic variation.
    Full-text · Article · Apr 2014 · Molecular Biology and Evolution
  • Source
    • "Transcript abundance may act as intermediate phenotype between loci and macroscopic phenotypes, and can be considered as expression quantitative trait (e-trait) in order to identify chromosomal regions where genotypes significantly affect gene expression [120]. By using cis- and trans- mapping approaches, other interesting questions regarding gene expression regulation could be addressed by combining QTL and eQTL: for instance the relative contributions of cis-regulatory elements versus trans-regulatory elements [121], or the exploration of the effect of gene duplications on the genetic regulatory network [122]. Because of the virtually unlimited types of data that can be integrated in QTL mapping for an "overall genomic information system" (e.g., eQTL, proteomics, metabolomics, association studies), the increase of gene mapping efforts in conifer species shall represent an important stage for conifer comparative genomics, simultaneously opening stimulating perspectives for evolutionary studies and molecular breeding applications. "
    ABSTRACT: The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly-enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also underlined by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with Angiosperms. It will serve as a basic reference to functional and association genetic studies of adaptation and growth in Picea taxa. The putative QTNs identified will be tested for associations in natural populations, with potential applications in molecular breeding and gene conservation programs. QTLs mapping consistently across years and environments could also be the most important targets for breeding, because they represent genomic regions that may be least affected by G × E interactions.
    Full-text · Article · Mar 2011 · BMC Genomics
  • Source
    • "It would be important to know how many of these cis-acting candidate QTL are genuine and how many are caused by qualitative differences between isoforms. As has been established for several years, SNP position and the number of SNPs overlapping the microarray probes have a significant impact on the discovery rate and the size of apparent expression differences for cis- (Schadt et al. 2003; Alberts et al. 2005, 2007; Doss et al. 2005; Chen et al. 2009) and trans-modulated transcripts (Chen et al. 2009). Our analysis found that the longer Illumina 50-mers have approximately the same sensitivity to sequence differences as the shorter Affymetrix 25-mer probes (Figure S1 and Figure S2). "
    ABSTRACT: Common sequence variants within a gene often generate important differences in expression of corresponding mRNAs. This high level of local (allelic) control, or cis modulation, rivals that produced by gene targeting, but expression is titrated finely over a range of levels. We are interested in exploiting this allelic variation to study gene function and downstream consequences of differences in expression dosage. We have used several bioinformatics and molecular approaches to estimate error rates in the discovery of cis modulation and to analyze some of the biological and technical confounds that contribute to the variation in gene expression profiling. Our analysis of SNPs and alternative transcripts, combined with eQTL maps and selective gene resequencing, revealed that between 17 and 25% of apparent cis modulation is caused by SNPs that overlap probes rather than by genuine quantitative differences in mRNA levels. This estimate climbs to 40-50% when qualitative differences between isoform variants are included. We have developed an analytical approach to filter differences in expression and improve the yield of genuine cis-modulated transcripts to approximately 80%. This improvement is important because the resulting variation can be successfully used to study downstream consequences of altered expression on higher-order phenotypes. Using a systems genetics approach we show that two validated cis-modulated genes, Stk25 and Rasd2, are likely to control expression of downstream targets and affect disease susceptibility.
    Full-text · Article · Nov 2009 · Genetics