SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies

Epidemiology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA.
Nucleic Acids Research (Impact Factor: 9.11). 06/2009; 37(Web Server issue):W600-5. DOI: 10.1093/nar/gkp290
Source: PubMed

ABSTRACT We have developed a set of web-based SNP selection tools (freely available online) where investigators can specify genes or linkage regions and select SNPs based on GWAS results, linkage disequilibrium (LD), and predicted functional characteristics of both coding and non-coding SNPs. The algorithm uses GWAS SNP P-value data and finds all SNPs in high LD with GWAS SNPs, so that selection is from a much larger set of SNPs than those genotyped in the GWAS itself. The program can also identify and choose tag SNPs for SNPs not in high LD with any GWAS SNP. We incorporate functional predictions of protein structure, gene regulation, splicing and miRNA binding, and consider whether the alternative alleles of a SNP are likely to have differential effects on function. Users can assign weights for different functional categories of SNPs to further tailor SNP selection. The program accounts for the LD structure of different populations so that a GWAS from one ethnic group can be used to choose SNPs for one or more other ethnic groups. Finally, we provide an example using prostate cancer and demonstrate that this algorithm can select a small panel of SNPs that includes many of the recently validated prostate cancer SNPs.
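The selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the SNPinfo implementation: the r² cutoff of 0.8, the greedy tagging rule, the category names, the weights, and all rsIDs and values below are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List, Set

R2_THRESHOLD = 0.8  # assumed cutoff for "high LD"; not taken from the paper


def pair(a: str, b: str) -> FrozenSet[str]:
    """Unordered key for a pairwise r^2 lookup."""
    return frozenset((a, b))


def in_high_ld(a: str, b: str, r2: Dict[FrozenSet[str], float]) -> bool:
    """A SNP is trivially in LD with itself; missing pairs count as r^2 = 0."""
    return a == b or r2.get(pair(a, b), 0.0) >= R2_THRESHOLD


def inherit_gwas_p(snps: List[str], gwas_p: Dict[str, float],
                   r2: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Give each SNP the smallest P-value among GWAS SNPs it is in high LD with."""
    inherited = {}
    for snp in snps:
        ps = [p for g, p in gwas_p.items() if in_high_ld(snp, g, r2)]
        if ps:
            inherited[snp] = min(ps)
    return inherited


def greedy_tags(snps: List[str], r2: Dict[FrozenSet[str], float]) -> List[str]:
    """Greedy tagging: repeatedly pick the SNP covering the most uncovered SNPs."""
    uncovered = set(snps)
    tags = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(uncovered),
                   key=lambda s: sum(in_high_ld(s, t, r2) for t in uncovered))
        tags.append(best)
        uncovered -= {t for t in uncovered if in_high_ld(best, t, r2)}
    return tags


def functional_score(categories: Set[str], weights: Dict[str, float]) -> float:
    """Sum of user-assigned weights over a SNP's predicted functional categories."""
    return sum(weights.get(c, 0.0) for c in categories)


# Toy data (entirely made up).
snps = ["rs1", "rs2", "rs3", "rs4", "rs5"]
gwas_p = {"rs1": 1e-6}  # only rs1 was genotyped in the GWAS
r2 = {pair("rs1", "rs2"): 0.95, pair("rs3", "rs4"): 0.90, pair("rs1", "rs3"): 0.20}
functions = {"rs2": {"nsSNP"}, "rs3": {"miRNA", "TFBS"}}
weights = {"nsSNP": 1.0, "TFBS": 0.5, "miRNA": 0.5}

inherited = inherit_gwas_p(snps, gwas_p, r2)           # rs1 and rs2 carry the GWAS signal
remaining = [s for s in snps if s not in inherited]    # SNPs with no GWAS proxy
tags = greedy_tags(remaining, r2)                      # tag SNPs for the rest
scores = {s: functional_score(functions.get(s, set()), weights) for s in snps}
```

In this toy run, rs2 inherits the P-value of rs1 through LD, while rs3 and rs5 are picked as tag SNPs for the SNPs that no GWAS SNP covers. Swapping in an r² table from a different population would change which SNPs inherit a GWAS signal, which is one way to picture the cross-population use the abstract mentions.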

Available from: Zongli Xu, Jun 25, 2015
  • Source
    • "The minor allele of rs9609538 is predicted to alter transcription factor (TF) binding site activity (multiple TFs including AIRE, AP-4, and CDP CR3) and miRNA binding site activity (hsa-miR-516a-5p and hsa-miR-548d-3p) based on FuncPred algorithms (Xu and Taylor 2009). SNP rs9609538 is positioned between C22orf28 (~500 bp upstream) and BPIL2 (5 bp downstream). "
    ABSTRACT: Epithelial ovarian cancer (EOC) is a heterogeneous cancer with both genetic and environmental risk factors. Variants influencing the risk of developing the less-common EOC subtypes have not been fully investigated. We performed a genome-wide association study (GWAS) of EOC according to subtype by pooling genomic DNA from 545 cases and 398 controls of European descent, and testing for allelic associations. We evaluated for replication 188 variants from the GWAS [56 variants for mucinous, 55 for endometrioid and clear cell, 53 for low-malignant potential (LMP) serous, and 24 for invasive serous EOC], selected using pre-defined criteria. Genotypes from 13,188 cases and 23,164 controls of European descent were used to perform unconditional logistic regression under the log-additive genetic model; odds ratios (OR) and 95 % confidence intervals are reported. Nine variants tagging six loci were associated with subtype-specific EOC risk at P < 0.05, and had an OR that agreed in direction of effect with the GWAS results. Several of these variants are in or near genes with a biological rationale for conferring EOC risk, including ZFP36L1 and RAD51B for mucinous EOC (rs17106154, OR = 1.17, P = 0.029, n = 1,483 cases), GRB10 for endometrioid and clear cell EOC (rs2190503, P = 0.014, n = 2,903 cases), and C22orf26/BPIL2 for LMP serous EOC (rs9609538, OR = 0.86, P = 0.0043, n = 892 cases). In analyses that included the 75 GWAS samples, the association between rs9609538 (OR = 0.84, P = 0.0007) and LMP serous EOC risk remained statistically significant at P < 0.0012 adjusted for multiple testing. Replication in additional samples will be important to verify these results for the less-common EOC subtypes.
    Human Genetics 11/2013; DOI:10.1007/s00439-013-1383-3 · 4.52 Impact Factor
  • Source
    • "To better understand possible relationships between genomic variation and disease related expression changes, we propose an analysis of the genetic variation at miRNA loci in a highly studied set of GWAS (WTCCC, 2007). We first use database resources to build up a subset of miRNA-associated regions in which to analyze common SNP variations as measured in GWAS (Griffiths-Jones et al., 2008; Xu and Taylor, 2009). We consider miRNA variation in SNPs within 100kb of a known miRNA or a gene sequence involved in miRNA processing. "
    ABSTRACT: In this chapter, the authors propose an analysis of the genetic variation at microRNAs (miRNAs) loci in a highly studied set of genome-wide association study (GWAS) to explain the relationships between genomic variation and disease related expression changes. The authors use database resources to build up a subset of miRNA-associated regions in which to analyze common single nucleotide polymorphism (SNP) variations as measured in GWAS. They investigate these locations using the Wellcome Trust Case Control Consortium (WTCCC) GWAS data to determine a set of miRNA-related variations associated with common diseases that can be used to follow up in laboratory studies to better understand aspects of disease pathogenesis, progression, maintenance, and treatment.
    microRNAs in Toxicology and Medicine, edited by Saura C. Sahu, 11/2013: chapter 19, Exploration of microRNA Genomic Variation Associated with Common Human Diseases: pages 309-316; Wiley. ISBN: 978-1-118-40161-3
  • Source
    • "This strong LD block contains three other non-synonymous SNPs including GABRA6 rs3811993, rs34907804, and GABRB2 rs2229945 (Supplementary Figure 1). GABRA2 rs567926 and rs279858 also share a very strong LD block with three other non-synonymous SNPs rs519972, rs41310789, and rs17852044 (rs519972: benign (Sunyaev et al, 2001; Xu and Taylor, 2009); rs41310789: exonic splicing enhancers or silencers (Xu and Taylor, 2009); and rs17852044: possibly damaging (Sunyaev et al, 2001; Xu and Taylor, 2009) based on bioinformatics prediction), and the LD structure contains only a single haplotype block (Supplementary Figure 2). Compared with the European LD plots, the Asian plots (Supplementary Figures 3 and 4) showed same results with consistent haplotype structure but stronger LD. "
    ABSTRACT: Gamma-aminobutyric acid (GABA) is a major inhibitory neurotransmitter in the mammalian brain. GABA receptor subunit genes are involved in a number of complex disorders including substance abuse. No associated variants on the commonly studied GABA receptor genes have been unequivocally identified as directly functional or pathogenic in genetic association studies of addictions. We hypothesize that meta-analysis can increase the statistical power to identify the association signals. To reconcile the conflicting associations with substance dependence traits, we performed a meta-analysis of the GABA receptor genes (GABRB2, GABRA6, GABRA1, and GABRG2 on chromosome 5q and GABRA2 on chromosome 4p12) using genotype data from 4739 cases of alcohol, heroin, opioid, or methamphetamine dependence and 4924 controls. Then, we combined these candidate gene association literature data with two additional samples with alcohol dependence (AD), including 1691 cases and 1712 controls from the Study of Addiction: Genetics and Environment (SAGE), and 2644 cases and 494 controls from our own study. We found strong associations between GABRA2 and AD (P=9 × 10⁻⁶ and odds ratio (OR) 95% confidence interval (CI)=1.27 (1.15, 1.4) for rs567926, P=4 × 10⁻⁵ and OR=1.21 (1.1, 1.32) for rs279858), and between GABRG2 and both dependence on alcohol and dependence on heroin (P=0.0005 and OR=1.22 (1.09, 1.37) for rs211014). Significant association was also observed between GABRA6 rs3219151 and AD. The GABRA2 rs279858 association was observed in the SAGE datasets with a combined P of 9 × 10⁻⁶ (OR=1.17 (1.09, 1.26)). When all of these datasets, including our samples, were meta-analyzed, associations of both GABRA2 SNPs remained (for rs567926, P=7 × 10⁻⁵ (OR=1.18 (1.09, 1.29)) in all the studies, and P=8 × 10⁻⁶ (OR=1.25 (1.13, 1.38)) in Europeans; for rs279858, P=5 × 10⁻⁶ (OR=1.18 (1.1, 1.26)) in Europeans). The selected threshold of Bonferroni correction for multiple comparisons was 0.007. We report here an extensive meta-analysis between the five GABA receptor candidate genes and substance abuse. Our findings suggest the involvement of the GABA receptor genes, minimally GABRA2, in the pathogenesis of alcohol dependence. Further replications with larger samples are warranted. Neuropsychopharmacology accepted article preview online, 18 October 2013; doi:10.1038/npp.2013.291.
    Neuropsychopharmacology: official publication of the American College of Neuropsychopharmacology 10/2013; DOI:10.1038/npp.2013.291 · 7.83 Impact Factor