chromosomal regions that appear to be linked to disease.
GWAS offers an agnostic approach to investigating
SNP-disease association, and the results of such studies
offer a wealth of data to inform the next generation of
investigation. Here, we develop a user-friendly web server
to incorporate such clinical, experimental, mechanistic,
and computational information with the results of
GWAS in order to organize, annotate, and select SNPs.
The web server can be used for either small or large-scale
SNP selection and is particularly useful for association
studies. It uses both functional prediction and GWAS
results to select not only SNPs included in the GWAS,
but also other functional SNPs in dbSNP that were not in
the GWAS. Considering the varied interests and emphases
that different investigators may bring to a problem, we
provide many tunable parameters in each web utility, so
the algorithm can be adjusted to meet different needs.
We employ several methods for functional sequence
assessment and predict the functional consequence of different
alleles of a SNP. To reduce the number of false-positive
results, we perform the predictions only in the most probable
genomic regions for each category of functional
sequence site (such as the gene promoter region for TFBS
or the 3′ UTR for microRNA-binding sites) and use
phylogenetic footprint information to filter out non-conserved
putative functional sequence. The SNP selection algorithm
uses functional prediction results to prioritize LD tag SNP
selection. These LD tag SNPs capture other unexamined
SNPs in and around a gene, including SNPs with unknown
or unpredicted functional consequences. The web utility
options allow an investigator to choose prediction methods
and assign weights to those predictions for study-specific
SNP selection. Functional sequence prediction is a rapidly
developing field. The web server structure allows rapid
updates as better methods of functional prediction
become available, and it allows expansion to include
predictions of other biological functions.
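As a rough illustration of how weighted functional predictions could prioritize LD tag SNP selection, the sketch below scores each SNP as a weighted sum of its prediction results and then picks tags in descending score order, letting each tag capture every SNP in strong LD with it. This is not the server's actual algorithm: the function names, the r² threshold of 0.8, the prediction method labels, the weights, and the SNP identifiers are all hypothetical and chosen only for the example.

```python
from typing import Dict, FrozenSet, List


def weighted_score(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of prediction scores; methods with no prediction count as 0."""
    return sum(w * predictions.get(method, 0.0) for method, w in weights.items())


def select_tag_snps(
    scores: Dict[str, float],
    r2: Dict[FrozenSet[str], float],
    threshold: float = 0.8,  # assumed LD cutoff, not taken from the paper
) -> List[str]:
    """Pick tag SNPs in descending functional score until every SNP is captured.

    A SNP is captured when it is itself a tag or when its pairwise r2 with
    some tag is at least `threshold`.
    """
    captured = set()
    tags = []
    # Highest functional score first; the SNP name breaks ties deterministically.
    for snp in sorted(scores, key=lambda s: (-scores[s], s)):
        if snp in captured:
            continue
        tags.append(snp)
        captured.add(snp)
        for other in scores:
            if r2.get(frozenset((snp, other)), 0.0) >= threshold:
                captured.add(other)
    return tags


# Hypothetical inputs: per-SNP prediction results and user-assigned weights.
predictions = {
    "rsA": {"tfbs": 1.0},
    "rsB": {},
    "rsC": {"mirna": 1.0},
    "rsD": {},
}
weights = {"tfbs": 2.0, "mirna": 1.0}

# Hypothetical pairwise LD (r2) values, keyed by unordered SNP pair.
r2 = {
    frozenset(("rsA", "rsB")): 0.90,
    frozenset(("rsC", "rsD")): 0.85,
    frozenset(("rsB", "rsC")): 0.30,
}

scores = {snp: weighted_score(pred, weights) for snp, pred in predictions.items()}
tags = select_tag_snps(scores, r2)
print(tags)
```

In this toy setup the two predicted-functional SNPs become the tags, and each captures a neighbor with no predicted function, which mirrors the idea that tag SNPs also cover SNPs with unknown functional consequences. Changing the weights changes the order in which candidates are considered, which is one simple way study-specific priorities could influence the selection.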
Supplementary Data are available at NAR Online.
This research was supported by the Intramural Research
Program of the NIH, National Institute of Environmental
Health Sciences. Funding for open access charge:
National Institute of Environmental Health Sciences.
Conflict of interest statement. None declared.
Nucleic Acids Research, 2009, Vol. 37, Web Server issue W605