Large-Scale Evaluation of Genetic Variants in Candidate
Genes for Colorectal Cancer Risk in the Nurses’ Health
Study and the Health Professionals’ Follow-up Study
Aditi Hazra,1 Stephen Chanock,5,6 Edward Giovannucci,1,2,3 David G. Cox,1,3 Tianhua Niu,1
Charles Fuchs,3,4 Walter C. Willett,1,2,3 and David J. Hunter1,2,3
1Program in Molecular and Genetic Epidemiology, Department of Epidemiology, and
2Department of Nutrition, Harvard School of Public Health;
3Channing Laboratory, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School;
4Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts; and
5Division of Cancer Epidemiology and Genetics and
6Core Genotype Facility at the Advanced Technology Centre, National Cancer Institute, Department of Health and Human Services, Bethesda, Maryland
Advances in genomics offer new strategies for assessing
the association of common genetic variations at multi-
ple loci and risk of many diseases, including colorectal
cancer. Low-penetrance alleles of genes in many
biological pathways, such as DNA repair, metabolism,
inflammation, cell cycle, apoptosis, and Wnt signaling,
may influence the risk of nonfamilial colorectal cancer.
To identify susceptibility genes for colorectal cancer, we
designed a large-scale case-control association study
nested within the Nurses’ Health Study (190 cases and
190 controls) and the Health Professionals’ Follow-up
Study (168 cases and 168 controls). We used a custom
GoldenGate (Illumina) oligonucleotide pool assay
including 1,536 single nucleotide polymorphisms
(SNP) selected in candidate genes from cancer-related
pathways, which have been sequenced and genotyped
in the SNP500Cancer project; 1,412 of the 1,536 SNPs (92%)
were genotyped successfully within 388
genes. SNPs in high linkage disequilibrium (r² ≥ 0.90)
with another assayed SNP were excluded from further
analyses. As expected by chance (and not significant
compared with a Bonferroni-corrected threshold of P = 0.00004),
in the additive model, 11 of 1,253 (0.9%) SNPs had
a Ptrend < 0.01 and 38 of 1,253 (3.0%) SNPs had a Ptrend ≥
0.01 and Ptrend < 0.05. Of note, the MGMT Lys178Arg
(rs2308237) SNP, in linkage disequilibrium with the
previously reported MGMT Ile143Val SNP, had an
inverse association with colorectal cancer risk (MGMT
Lys178Arg: odds ratio, 0.52; 95% confidence interval,
0.35-0.78; unadjusted Ptrend = 0.0003 for the additive
model; gene-based test global P = 0.00003). The
SNP500Cancer database and the Illumina GoldenGate
Assay allowed us to test a larger number of SNPs than
previously possible. We identified several SNPs wor-
thy of investigation in larger studies.
(Cancer Epidemiol Biomarkers Prev 2008;17(2):311–9)
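The multiple-testing figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a family-wise α of 0.05 (not stated in the text) divided across the 1,253 SNPs retained for analysis. It also illustrates an additive-model trend test using a Cochran-Armitage statistic; the abstract does not name the exact test used, so that choice is an assumption, and the genotype counts passed to `trend_test` are hypothetical.

```python
import math

N_SNPS = 1253   # SNPs analyzed after the LD exclusion (from the abstract)
ALPHA = 0.05    # ASSUMED family-wise error rate; not stated in the text


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def expected_by_chance(n_tests, p_low, p_high):
    """Expected number of null tests with p_low <= P < p_high."""
    return n_tests * (p_high - p_low)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases/controls hold counts for 0, 1, and 2 copies of the minor allele;
    weights (0, 1, 2) encode the additive model. Returns (z, two-sided p).
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    col = [r + s for r, s in zip(cases, controls)]
    t = sum(w * (r * n_controls - s * n_cases)
            for w, r, s in zip(weights, cases, controls))
    var = sum(w * w * n * (n_total - n) for w, n in zip(weights, col))
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            var -= 2 * weights[i] * weights[j] * col[i] * col[j]
    var *= n_cases * n_controls / n_total
    z = t / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Bonferroni threshold: {threshold:.5f}")
print(f"Expected with P < 0.01: {expected_by_chance(N_SNPS, 0.0, 0.01):.1f}")
print(f"Expected with 0.01 <= P < 0.05: {expected_by_chance(N_SNPS, 0.01, 0.05):.1f}")

# Hypothetical genotype counts (not study data): a negative z means the
# minor allele is less common among cases than among controls.
z, p = trend_test(cases=(100, 70, 20), controls=(70, 90, 30))
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

Under these assumptions, about 12.5 SNPs would be expected to reach P < 0.01 by chance and about 50 to fall between 0.01 and 0.05, which is in line with the 11 and 38 SNPs observed.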
Colorectal cancer, a complex disease arising from both
genetic and environmental factors, is the third most
common cancer and the second most common cause of
death due to cancer in the United States (1).
So far, susceptibility to colorectal cancer has been
characterized by the identification of rare inherited
mutations in a small number of established genes and
by diet and lifestyle factors, including intake of red meat
and alcohol as well as smoking, physical activity, and
obesity. High-penetrance mutations in the APC/WNT
pathway (in the APC, AXIN, and CTNNB1 genes) and
mismatch repair pathway (in the MLH1, MSH1, MSH2,
MSH3, PMS1, and PMS2 genes) are found in a
proportion of cases of familial colorectal cancer, but
these alterations account for only a small fraction of the
risk of colorectal cancer in the general population.
However, genetic variation in common, low-penetrance
genes in multiple biological pathways, such as DNA
repair, metabolism, inflammation, cell cycle, apoptosis,
and WNT signaling, may also contribute to the etiology
of inherited and sporadic cases of colorectal cancer. The
spectrum of allelic differences in low-penetrance genes
could account for the interindividual variation in
response to diet and lifestyle factors.
The advent of highly annotated single nucleotide
polymorphism (SNP) data from the International Hap-
Map Project, coupled with the development of high-
throughput genotyping platforms, has made SNPs
attractive markers for large-scale association studies in
candidate genes (2-5). The candidate gene approach may
offer valuable insight for detecting genetic associations
with colorectal cancer. However, most published studies
have been limited to a single or a small number of SNPs
in a small number of genes. Therefore, we conducted a
prospective nested case-control study in the Nurses’
Health Study (NHS) and the Health Professionals’
Follow-up Study (HPFS) to evaluate common sequence
variants at 1,536 loci in 388 genes to assess associations
with susceptibility to colorectal cancer.
Cancer Epidemiol Biomarkers Prev 2008;17(2). February 2008
Received 3/2/07; revised 10/5/07; accepted 11/21/07.
Grant support: NIH research grants CA70817, CA87969, and CA55075, Entertainment
Industry Foundation National Colorectal Cancer Research Alliance, and NIH training
grant T-32 CA 09001-30 (A. Hazra).
The costs of publication of this article were defrayed in part by the payment of page
charges. This article must therefore be hereby marked advertisement in accordance
with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Epidemiology,
Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).
Requests for reprints: Aditi Hazra, Channing Laboratory, Department of Medicine,
Brigham and Women’s Hospital and Harvard Medical School, 181 Longwood
Avenue, Boston, MA 02115. Phone: 617-525-2035; Fax: 617-525-2008.
Copyright © 2008 American Association for Cancer Research.
Materials and Methods
Study Population. The NHS is an ongoing prospec-
tive study of 121,700 U.S. female registered nurses.
Details of the design and follow-up of this cohort have
been described previously (9, 10). Briefly, at enrollment
in 1976, the participants, ages 30 to 55 years, completed
a questionnaire providing information on risk factors for
cancer and cardiovascular disease. Exposure and disease
information are updated biennially. From 1989 to 1990,
blood samples were collected from 32,826 of the NHS
participants. After blood collection through June 2000,
197 incident cases of colorectal cancer were confirmed
through medical records or death reports, of which 190
cases were successfully genotyped. Controls were
randomly selected from women who were alive and
free of cancer at the time of case ascertainment. One
control was matched to each case on year of birth and
month of blood draw.
The HPFS began in 1986 when 51,529 U.S. male
dentists, optometrists, osteopaths, podiatrists, pharma-
cists, and veterinarians, ages 40 to 75 years, responded
to a mailed questionnaire (9). These men provided
baseline information on age, marital status, height,
weight, ancestry, medications, smoking history, medical
history, physical activity, and diet. Exposure and
medical history information are updated every 2 years.
Blood samples were collected from 18,225 of the HPFS
participants between 1993 and 1995. Among these men,
168 incident cases of colorectal cancer were identified
between the date of blood draw and January 2002. Men
who were alive and free of diagnosed cancer at the
time of case ascertainment were selected as controls
and were matched to cases on year of birth and month
of blood draw.
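The 1:1 matching described above can be sketched as follows. This is a minimal illustration, not the cohorts' actual selection procedure: the record fields (`id`, `birth_year`, `draw_month`, `cancer_free`) and the helper `match_controls` are hypothetical, exact matching on both factors is assumed, and eligibility at the time of case ascertainment is reduced to a single flag.

```python
import random
from collections import defaultdict


def match_controls(cases, candidates, seed=0):
    """Pair each case with one randomly chosen eligible control that has the
    same year of birth and month of blood draw. Returns {case_id: control_id},
    with None when no eligible control is available."""
    rng = random.Random(seed)

    # Group eligible candidates by the two matching factors.
    pools = defaultdict(list)
    for person in candidates:
        if person["cancer_free"]:
            key = (person["birth_year"], person["draw_month"])
            pools[key].append(person)
    for pool in pools.values():
        rng.shuffle(pool)  # random selection within each stratum

    pairs = {}
    for case in cases:
        pool = pools[(case["birth_year"], case["draw_month"])]
        # pop() removes the control so it cannot be matched twice
        pairs[case["id"]] = pool.pop()["id"] if pool else None
    return pairs


# Hypothetical records, for illustration only.
cases = [
    {"id": "case1", "birth_year": 1930, "draw_month": "1989-06"},
    {"id": "case2", "birth_year": 1941, "draw_month": "1990-01"},
]
candidates = [
    {"id": "c1", "birth_year": 1930, "draw_month": "1989-06", "cancer_free": True},
    {"id": "c2", "birth_year": 1930, "draw_month": "1989-06", "cancer_free": False},
    {"id": "c3", "birth_year": 1941, "draw_month": "1990-02", "cancer_free": True},
]
pairs = match_controls(cases, candidates)
print(pairs)
```

In this toy input, `case1` is paired with `c1` (the only eligible candidate in its stratum) and `case2` is left unmatched because no candidate shares its month of blood draw.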
Sample Collection. Venous blood samples were
separated into plasma, buffy coat, and RBCs and stored
in liquid nitrogen. Genomic DNA was extracted from
50 µL buffy coat diluted with 150 µL PBS using the
QIAamp (Qiagen) 96-spin blood protocol according to the
manufacturer’s instructions. Concentrations of genomic
DNA were measured in 96-well format with PicoGreen
technology (Molecular Probes).
Gene and SNP Selection. The SNP500Cancer
database (http://snp500cancer.nci.nih.gov), a compo-
nent of the National Cancer Institute’s Cancer Genome
Anatomy Project, provides sequence and genotype assay
information for candidate SNPs in genes hypothesized
to be related to cancer. National Cancer Institute’s
SNP500Cancer reports sequence analysis and allele
prevalence information in anonymized control DNA
samples (n = 102 Coriell samples representing four self-
described ethnic groups: African/African American,
Caucasian, Hispanic, and Pacific Rim; refs. 2, 4).
We designed an Illumina oligonucleotide pool assay
(OPA) selecting candidate genes that were resequenced
in the SNP500Cancer project (4). As of November 2004,
there were a total of 5,800 SNPs with a minor allele
frequency (MAF) of >3% in the combined four self-
described populations (African/African American,
Caucasian, Hispanic, and Pacific Rim) annotated in
the SNP500 database (2). Overall, the SNP selection
approach for this study was to examine 10 kb
upstream and 10 kb downstream in accordance with
design score validations based on Illumina in-house
measurements and the 60-bp limitation (a SNP cannot
be closer than 60 bp to another SNP on this OPA).
After excluding SNPs with high r² (defined as r² ≥
0.8), we did a preliminary screen of the remaining
3,072 SNPs in the SNP500Cancer database (2-8, 11).
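The selection rules described so far (MAF above 3%, exclusion of SNPs with r² ≥ 0.8, and the 60-bp spacing limit) can be expressed as a simple greedy filter. The sketch below is an illustration under stated assumptions rather than the procedure actually used: the SNP records, the `pairwise_r2` lookup, and the function `select_snps` are hypothetical, SNPs are considered in the order listed, and all SNPs are assumed to lie on one chromosome.

```python
def select_snps(snps, pairwise_r2, min_maf=0.03, max_r2=0.8, min_gap_bp=60):
    """Greedy filter: keep a SNP only if its MAF exceeds min_maf and it is
    neither in high LD with, nor closer than min_gap_bp to, a SNP already kept."""
    kept = []
    for snp in snps:
        if snp["maf"] <= min_maf:
            continue  # fails the MAF > 3% requirement
        conflict = False
        for other in kept:
            pair = frozenset((snp["id"], other["id"]))
            r2 = pairwise_r2.get(pair, 0.0)  # missing pairs treated as unlinked
            too_close = abs(snp["pos"] - other["pos"]) < min_gap_bp
            if r2 >= max_r2 or too_close:
                conflict = True
                break
        if not conflict:
            kept.append(snp)
    return [s["id"] for s in kept]


# Hypothetical SNPs (IDs, positions, and frequencies are made up).
snps = [
    {"id": "snpA", "pos": 1000, "maf": 0.25},
    {"id": "snpB", "pos": 1040, "maf": 0.30},   # 40 bp from snpA
    {"id": "snpC", "pos": 5000, "maf": 0.02},   # MAF below cutoff
    {"id": "snpD", "pos": 9000, "maf": 0.10},   # high LD with snpA
    {"id": "snpE", "pos": 12000, "maf": 0.15},
]
pairwise_r2 = {frozenset(("snpA", "snpD")): 0.95}

selected = select_snps(snps, pairwise_r2)
print(selected)
```

A greedy pass like this depends on input order: whichever SNP of a correlated or closely spaced pair is seen first is the one retained, so a real design would need an explicit priority rule (for example, assay design score).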
From the 3,072 SNP panel, the final SNP selection of
the 1,536 polymorphisms chosen for the OPA included
SNPs with a MAF of >3% in the unrelated (that is,
parents) HapMap CEPH Utah (CEU, with European
Table 1. Select characteristics of the study population
HPFS: Cases (n = 168), Controls (n = 168); NHS: Cases (n = 197), Controls (n = 197)
Mean age at diagnosis or selection (y)
Family history of colorectal cancer (%)
Never smokers (%)
Mean pack-years of smoking†
Regular aspirin use (%)
Current use of postmenopausal hormone (%)‡
Multivitamin use (%)
Mean body mass index, kg/m²
Mean physical activity, MET-h/wk
Red meat intake ≥1 serving/d (%)
Mean dietary folate, µg/d§
Mean alcohol intake, g/day§
Colon cancer by site
*P value for the combined NHS and HPFS cases and controls.
†Pack-years of smoking were calculated among past and current smokers only.
‡Percentage of women using hormone replacement therapy was calculated among postmenopausal women only.
§Mean values calculated at baseline.
interactions. Although the SNPs were selected among
candidate genes, we did not use a haplotype-tagging
approach. Therefore, the association with colorectal
cancer risk is not definitive for many genes, for which
only one to two SNPs were included in this study.
Nevertheless, there were 145 nonsynonymous SNPs in
this analysis, of which two loci were strongly associated
with colorectal cancer risk. In addition, the low MAF of
several SNPs may make accurate detection of a modest
association with colorectal cancer risk difficult. Further,
we recognize that susceptibility to colorectal cancer is
influenced by genetic epistasis and determined by
synergistic interactions between environmental carcino-
gens and allelic variants of multiple genes in numerous
pathways. Last, the functional relevance of many of the
polymorphisms examined in this study is unknown.
In summary, these data extend the current knowledge
of genetic variation associated with colorectal cancer risk.
SNP-based and gene-based approaches suggest that
genetic variants in MGMT may be associated with
colorectal cancer risk. Our study lends further support
to the previously reported association of the MGMT
Ile143Val SNP (located near the 145Cys residue and in
linkage disequilibrium with the MGMT Lys178Arg SNP
genotyped in this OPA) with risk of colorectal cancer.
In addition, we identified
novel polymorphisms, including nonsynonymous SNPs
SAT Arg126Cys and COMT Val158Met, associated with
colorectal cancer risk. The PMS2 S260S SNP is in a gene
in the mismatch repair pathway, a major pathway with
an established relation to colorectal cancer. In addition to
replication of these findings in other populations, further
investigation to establish the functional relationship of
these SNPs with colorectal cancer is warranted. With
advances in affordable genotyping technology and
annotation of common human genetic variation, large-
scale analyses such as this study have the potential to
substantially clarify the inherited component of colorec-
tal cancer risk.
Acknowledgments
We thank Pati Soule, Hardeep Ranu, and Craig Labadie for
laboratory assistance and the dedicated participants of the NHS
and the HPFS for their ongoing commitment.
References
1. American Cancer Society. Cancer facts and figures. 2006.
2. Packer BR, Yeager M, Staats B, et al. SNP500Cancer: a public resource
for sequence validation and assay development for genetic variation
in candidate genes. Nucleic Acids Res 2004;32:D528–32.
3. A haplotype map of the human genome. Nature 2005;437:1299–320.
4. Packer BR, Yeager M, Burdett L, et al. SNP500Cancer: a public
resource for sequence validation, assay development, and frequency
analysis for genetic variation in candidate genes. Nucleic Acids Res
5. Steemers FJ, Gunderson KL. Illumina, Inc. Pharmacogenomics 2005;
6. Foster CB, Aswath K, Chanock SJ, McKay HF, Peters U. Polymorph-
ism analysis of six selenoprotein genes: support for a selective sweep
at the glutathione peroxidase 1 locus (3p21) in Asian populations.
BMC Genet 2006;7:56.
7. Hughes AL, Packer B, Welch R, Chanock SJ, Yeager M. High level of
functional polymorphism indicates a unique role of natural selection
at human immune system loci. Immunogenetics 2005;57:821–7.
8. Hunter DJ, Riboli E, Haiman CA, et al. A candidate gene approach to
searching for low-penetrance breast and prostate cancer genes. Nat
Rev Cancer 2005;5:977–85.
9. Tranah GJ, Giovannucci E, Ma J, Fuchs C, Hunter DJ. APC
Asp1822Val and Gly2502Ser polymorphisms and risk of colorectal
cancer and adenoma. Cancer Epidemiol Biomarkers Prev 2005;14:
10. Tranah GJ, Bugni J, Giovannucci E, et al. O6-methylguanine-DNA
methyltransferase Leu84Phe and Ile143Val polymorphisms and
risk of colorectal cancer in the Nurses’ Health Study and
Physicians’ Health Study (United States). Cancer Causes Control
11. Garcia-Closas M, Malats N, Real FX, et al. Large-scale evaluation of
candidate genes identifies associations between VEGF polymorph-
isms and bladder cancer risk. PLoS Genet 2007;3:e29.
12. Tranah GJ, Lescault PJ, Hunter DJ, De Vivo I. Multiple displacement
amplification prior to single nucleotide polymorphism genotyping in
epidemiologic studies. Biotechnol Lett 2003;25:1031–6.
13. Paynter RA, Skibola DR, Skibola CF, Buffler PA, Wiemels JL, Smith
MT. Accuracy of multiplexed Illumina platform-based single-
nucleotide polymorphism genotyping compared between genomic
and whole genome amplified DNA collected from multiple sources.
Cancer Epidemiol Biomarkers Prev 2006;15:2533–6.
14. SAS. Genetics user’s guide. Cary: SAS Institute, Inc.; 2002.
15. Nyholt DR. A simple correction for multiple testing for single-
nucleotide polymorphisms in linkage disequilibrium with each
other. Am J Hum Genet 2004;74:765–9.
16. Rothe C, Koszycki D, Bradwejn J, et al. Association of the Val158Met
catechol O-methyltransferase genetic polymorphism with panic
disorder. Neuropsychopharmacology 2006;31:2237–42.
17. Sunyaev S, Ramensky V, Bork P. Towards a structural basis of
human non-synonymous single nucleotide polymorphisms. Trends
18. Webb EL, Rudd MF, Sellick GS, et al. Search for low penetrance
alleles for colorectal cancer through a scan of 1467 non-synonymous
SNPs in 2575 cases and 2707 controls with validation by kin-cohort
analysis of 14 704 first-degree relatives. Hum Mol Genet 2006;15:
19. Lindahl T, Demple B, Robins P. Suicide inactivation of the E. coli O6-
methylguanine-DNA methyltransferase. EMBO J 1982;1:1359–63.
20. Pegg AE. Regulation of ornithine decarboxylase. J Biol Chem 2006;
21. Chueh LL, Nakamura T, Nakatsu Y, Sakumi K, Hayakawa H,
Sekiguchi M. Specific amino acid sequences required for O6-
methylguanine-DNA methyltransferase activity: analyses of three
residues at or near the methyl acceptor site. Carcinogenesis 1992;13:
22. Linsalata M, Giannini R, Notarnicola M, Cavallini A. Peroxisome
proliferator-activated receptor γ and spermidine/spermine N1-
acetyltransferase gene expressions are significantly correlated in
human colorectal cancer. BMC Cancer 2006;6:191.
23. Babbar N, Gerner EW, Casero RA, Jr. Induction of spermidine/
spermine N1-acetyltransferase (SSAT) by aspirin in Caco-2 colon
cancer cells. Biochem J 2006;394:317–24.
24. Thacker J. The RAD51 gene family, genetic instability and cancer.
Cancer Lett 2005;219:125–35.
25. Modrich P. Mechanisms and biological effects of mismatch repair.
Annu Rev Genet 1991;25:229–53.
26. Stevens RG, Morris JE, Cordis GA, Anderson LE, Rosenberg DW,
Sasser LB. Oxidative damage in colon and mammary tissue of the
HFE-knockout mouse. Free Radic Biol Med 2003;34:1212–6.
27. Shaheen NJ, Silverman LM, Keku T, et al. Association between
hemochromatosis (HFE) gene mutation carrier status and the risk of
colon cancer. J Natl Cancer Inst 2003;95:154–9.
28. Chan AT, Ma J, Tranah GJ, et al. Hemochromatosis gene mutations,
body iron stores, dietary iron, and risk of colorectal adenoma in
women. J Natl Cancer Inst 2005;97:917–26.
29. Mannisto PT, Kaakkola S. Catechol-O-methyltransferase (COMT):
biochemistry, molecular biology, pharmacology, and clinical effica-
cy of the new selective COMT inhibitors. Pharmacol Rev 1999;51:
30. Lotta T, Vidgren J, Tilgmann C, et al. Kinetics of human soluble and
membrane-bound catechol O-methyltransferase: a revised mecha-
nism and description of the thermolabile variant of the enzyme.
31. Dawling S, Roodi N, Mernaugh RL, Wang X, Parl FF. Catechol-O-
methyltransferase (COMT)-mediated metabolism of catechol estro-
gens: comparison of wild-type and variant COMT isoforms. Cancer
32. Thompson PA, Shields PG, Freudenheim JL, et al. Genetic poly-
morphisms in catechol-O-methyltransferase, menopausal status, and
breast cancer risk. Cancer Res 1998;58:2107–10.