Genome-wide association study identifies novel
alleles associated with risk of cutaneous basal cell
carcinoma and squamous cell carcinoma
Hongmei Nan1, Mousheng Xu3, Peter Kraft3, Abrar A. Qureshi1,2, Constance Chen3, Qun Guo1,
Frank B. Hu1,3,4, Gary Curhan1,3, Christopher I. Amos5, Li-E. Wang5, Jeffrey E. Lee6, Qingyi Wei5,
David J. Hunter1,3,4 and Jiali Han1,2,3,∗
1Channing Laboratory, Department of Medicine and 2Clinical Research Program, Department of Dermatology,
Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA, 3Department of Epidemiology and
4Department of Nutrition, Harvard School of Public Health, Boston, MA, USA, 5Department of Epidemiology and
6Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
Received February 10, 2011; Revised June 6, 2011; Accepted June 16, 2011
We conducted a genome-wide association study on cutaneous basal cell carcinoma (BCC) among 2045 cases
and 6013 controls of European ancestry, with follow-up replication in 1426 cases and 4845 controls. A
non-synonymous SNP in the MC1R gene (rs1805007, encoding an Arg151Cys substitution), a previously
well-documented pigmentation gene, showed the strongest association with BCC risk in the discovery set
[rs1805007[T]: OR (95% CI) for combined discovery set and replication set, 1.55 (1.45–1.66); P = 4.3 ×
10⁻¹⁷]. We identified an SNP, rs12210050, at 6p25 near the EXOC2 gene that was associated with an increased
risk of BCC [rs12210050[T]: combined OR (95% CI), 1.24 (1.17–1.31); P = 9.9 × 10⁻¹⁰]. In the locus on 13q32
near the UBAC2 gene encoding ubiquitin-associated domain-containing protein 2, we also identified a variant
conferring susceptibility to BCC [rs7335046[G]: combined OR (95% CI), 1.26 (1.18–1.34); P = 2.9 × 10⁻⁸]. We
further evaluated the associations of these two novel SNPs (rs12210050 and rs7335046) with squamous cell
carcinoma (SCC) risk as well as melanoma risk. We found that both variants, rs12210050[T] [OR (95% CI), 1.35
(1.16–1.57); P = 7.6 × 10⁻⁵] and rs7335046[G] [OR (95% CI), 1.21 (1.02–1.44); P = 0.03], were associated with
an increased risk of SCC. These two variants were not associated with melanoma risk. We conclude that 6p25
and 13q32 are novel loci conferring susceptibility to non-melanoma skin cancer.
Basal cell carcinoma (BCC), a basal keratinocyte tumor in
the epidermis, is the most common form of non-melanoma
skin cancer, followed by squamous cell carcinoma (SCC).
BCC is the most commonly diagnosed cancer among popu-
lations of European ancestry, with more than 1 million new
cases each year in the USA, representing ~80% of all skin
cancer cases (1). Despite this high incidence, BCC is rarely
fatal and uncommonly metastasizes. However, it can cause
clinically significant destruction of surrounding tissues if
not treated adequately. BCC typically occurs in areas
exposed to the sun, and ultraviolet (UV) exposure is the
most important and common environmental risk factor.
The major host susceptibility risk factor of BCC is lighter
pigmentation (2). UV-induced somatic p53 mutations have
frequently been found in BCC cases. In addition, somatic
mutations in the patched 1 (PTCH1) gene, a receptor in
the hedgehog signaling pathway, have been found in most
BCC cases (3). In addition to these rare high-penetrance
alleles, common low-penetrance alleles also contribute to
the genetic susceptibility to BCC. For example, genetic var-
iants in the melanocortin 1 receptor (MC1R) gene, the major
known contributor to skin pigmentation, were associated
with an increased risk of BCC as well as melanoma and
∗To whom correspondence should be addressed at: Channing Laboratory, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical
School, 181 Longwood Avenue, Boston, MA 02115, USA. Tel: +1 6175252098; Email: firstname.lastname@example.org
© The Author 2011. Published by Oxford University Press. All rights reserved.
For Permissions, please email: email@example.com
Human Molecular Genetics, 2011, Vol. 20, No. 18
Advance Access published on June 23, 2011
Recent genome-wide association studies (GWASs) ident-
ified several genetic loci (including 1p36, 1q42, 5p15, 7q32,
9p21, 12q13 and 11q14) that confer susceptibility to BCC
(6,11–13). We have presented the results of these previously
identified susceptibility loci (except for 11q14) in the discov-
ery set of our study in Supplementary Material, Table S1. To
identify additional genetic loci, we performed a multistage
GWAS of BCC. First, to obtain a discovery set, we conducted
a GWAS among 2045 cases of BCC in both men and women
and 6013 controls of European ancestry in the USA (Sup-
plementary Material, Table S2). We combined data from
five case–control studies nested within the Nurses’ Health
Study (NHS) and the Health Professionals Follow-up Study
(HPFS): a type 2 diabetes case–control study nested within
the NHS (T2D_NHS, BCC cases = 665, BCC controls =
2162); a type 2 diabetes case–control study nested within
the HPFS (T2D_HPFS, BCC cases = 597, BCC controls =
1555); a coronary heart disease case–control study nested
within the NHS (CHD_NHS, BCC cases = 253, BCC
controls = 765); a coronary heart disease case–control study
nested within the HPFS (CHD_HPFS, BCC cases = 282,
BCC controls = 715) and a postmenopausal invasive breast
cancer case–control study (controls only) nested within the
NHS (BC_NHS, BCC cases = 248, BCC controls = 816).
Second, we conducted a fast-track replication of eight promis-
ing SNPs in the replication set of 1426 BCC cases and 4845
controls (Supplementary Material, Table S2). These cases
and controls in the replication set were from three studies: a
study of 24 h urine composition in individuals with and
without a history of kidney stones within the NHS and
HPFS (KS_NHS_HPFS, BCC cases = 232, BCC controls =
703); a BCC case–control study nested within the NHS
(BCC_NHS, BCC cases = 588, BCC controls = 2026) and a
renal function study nested within the NHS (RF_NHS, BCC
cases = 606, BCC controls = 2116). There was no sample
overlap among the five studies of the discovery set and the
three studies of the replication set, nor between the discovery
and replication sets. The study protocol was approved by the
Institutional Review Board of Brigham and Women’s Hospital
and the Harvard School of Public Health.
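As a quick arithmetic check on the study design described above, the per-study BCC case and control counts can be tallied against the stated totals. The Python sketch below is illustrative only; the dictionary names and the `totals` helper are hypothetical labels introduced here, not part of the original analysis.

```python
# Per-study (BCC cases, BCC controls), copied from the text above.
DISCOVERY = {
    "T2D_NHS": (665, 2162),
    "T2D_HPFS": (597, 1555),
    "CHD_NHS": (253, 765),
    "CHD_HPFS": (282, 715),
    "BC_NHS": (248, 816),
}
REPLICATION = {
    "KS_NHS_HPFS": (232, 703),
    "BCC_NHS": (588, 2026),
    "RF_NHS": (606, 2116),
}


def totals(studies):
    """Sum cases and controls over a {study: (cases, controls)} mapping."""
    cases = sum(c for c, _ in studies.values())
    controls = sum(k for _, k in studies.values())
    return cases, controls


print(totals(DISCOVERY))    # (2045, 6013)
print(totals(REPLICATION))  # (1426, 4845)
```

Both tallies agree with the totals quoted for the discovery set (2045 cases, 6013 controls) and the replication set (1426 cases, 4845 controls).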
Detailed descriptions of the population for each study in the
discovery set and replication set are presented in Supplemen-
tary Material, Methods. Both the NHS and HPFS collected
information on self-reported diagnosis of BCC. The definitions
of BCC for each study of discovery set and replication set are
provided in Supplementary Material, Methods.
In each GWAS of the discovery set, imputed SNPs
with minor allele frequency (MAF) > 2.5% and imputation
R² > 0.3 were selected for combined meta-analysis. The
number of SNPs used in each study of the discovery
set is presented in Materials and Methods. A total of
2 318 094 SNPs were finally available for meta-analysis.
The quantile–quantile (Q–Q) plots based on the five individual
GWASs and combined meta-analysis in the discovery set
are presented in Supplementary Material, Figure S1. The
Q–Q plots did not demonstrate a systematic deviation from
the expected distribution, consistent with a minimal likelihood
of systematic genotype error or bias due to underlying population
substructure. The overall genomic control inflation
factor was λGC = 0.996.
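For readers unfamiliar with the genomic control inflation factor, the following Python sketch shows one common way to compute it from a list of association P-values: convert each P-value to a 1-df chi-square statistic and divide the median by the null median (about 0.455). This is a minimal illustration under that standard definition; the function name and the toy input are assumptions, and the exact procedure used in this study is not specified in the text.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC for two-sided P-values strictly between 0 and 1."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to the 1-df chi-square z**2,
    # where z is the (1 - p/2) quantile of the standard normal.
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical input: evenly spaced P-values, mimicking a perfectly null scan.
    n = 9999
    null_p = [(i + 1) / (n + 1) for i in range(n)]
    print(round(genomic_inflation(null_p), 3))
```

Values close to 1, such as the 0.996 reported here, indicate that the bulk of the test statistics follow the null distribution.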
We selected the top four regions (chromosomes 3, 6, 9 and 13)
for a fast-track replication. To ensure the validity of genotyping,
in each region except for the region of UBAC2 on
chromosome 13, we selected two top SNPs in linkage disequilibrium
(LD) as mutual surrogates (r² > 0.9 in HapMap CEU).
These SNPs had P-values < 1.5 × 10⁻⁶ in the discovery
set. We excluded SNPs with a P-value for the heterogeneity test
< 0.01. Near the region of UBAC2, the SNP rs7335046 was
ranked number 2 for the association with BCC risk in the discovery
set. Although there were other SNPs in complete LD
with rs7335046, those SNPs showed a P-value
for the heterogeneity test < 0.01 in the discovery set of this
study. Hence, we selected the SNP rs12019494 in this
region, with a P-value for heterogeneity > 0.01 (Pheterogeneity =
0.28) and in modest LD with rs7335046
(r² = 0.4 in HapMap CEU). Moreover, in the discovery set,
we found that a non-synonymous SNP in the MC1R gene
(rs1805007), a previously well-documented pigmentation
gene, was ranked number 1 for the association with
BCC risk (rs1805007, P = 5.9 × 10⁻⁹). We included the
SNP rs1805007 for further replication as well. The imputation
R² and association results of these nine SNPs with BCC risk in
the discovery set are presented in Supplementary Material,
Tables S3 and S4.
We attempted to replicate the associations of the selected
nine SNPs with BCC risk in a replication set of 1426 cases and
4845 controls. Of the nine SNPs selected, in addition to the
SNP rs1805007 in the MC1R gene, two SNPs near the
EXOC2 gene on 6p25 (rs12210050 and rs12202284) and two
SNPs near the UBAC2 gene on 13q32 (rs7335046 and
rs12019494) were replicated with P-value < 0.05 (Supplementary
Material, Table S5). After combining the discovery set
with the replication set, the SNP rs1805007 in the MC1R
gene was identified as having the strongest association with BCC risk
[rs1805007[T]: OR (95% CI), 1.55 (1.45–1.66); P = 4.3 ×
10⁻¹⁷] (Table 1). Two SNPs, one near the EXOC2 gene
(rs12210050, P = 9.9 × 10⁻¹⁰) and the other near the
UBAC2 gene (rs7335046, P = 2.9 × 10⁻⁸), also
reached genome-wide significance at the 5.0 ×
10⁻⁸ threshold. The ORs (95% CI) for SNP rs12210050[T]
and rs7335046[G] were 1.24 (1.17–1.31) and 1.26 (1.18–1.34),
respectively (Table 1). No genome-wide significant
results were found for the remaining six SNPs in the combined
set (Supplementary Material, Table S5). The regional association
plots for both the EXOC2 and UBAC2 regions in the discovery
set are presented in Figures 1 and 2. For the region
EXOC2, after adjusting for rs12210050 in the discovery set,
none of the remaining 989 SNPs in this region was significant
at P < 0.001. Similarly, in the region UBAC2, after adjusting
for rs12210050 in the discovery set, none of the remaining 802
SNPs was significant at P < 0.001. It is likely that these identified
markers are both in LD with the causal variants in these regions.
We further evaluated the associations of the three SNPs that
reached genome-wide significance (rs1805007, rs12210050
and rs7335046) with the risk of SCC in 783 incident cases
and 2026 controls nested within the NHS and HPFS (Table 2).
Details of the study population are provided in Supplementary
Material, Methods. All three SNPs were significantly associated
with the risk of SCC: rs1805007 (P = 0.002),
rs12210050 (P = 7.6 × 10⁻⁵) and rs7335046 (P = 0.03).
The ORs (95% CI) for rs1805007[T],
rs12210050[T] and rs7335046[G] were 1.37 (1.12–1.68),
1.35 (1.16–1.57) and 1.21 (1.02–1.44), respectively (Table 2).
Moreover, we evaluated the association of these three SNPs
with melanoma risk in 586 melanoma cases and 2026 controls
nested within the NHS and HPFS (set 1). Details of the study
population are described in Supplementary Material, Methods.
The SNP rs1805007[T] was significantly associated with the
risk of melanoma [rs1805007[T]: OR (95% CI), 1.63 (1.32–2.01);
P = 6.0 × 10⁻⁶]. For rs12210050 and rs7335046, we
also have data from a case–control study of 1804 melanoma
cases and 1027 controls from the MD Anderson Cancer
Center (set 2). Details of the study population are described
in Supplementary Material, Methods. For both rs12210050
and rs7335046, a meta-analysis was used to combine the
Table 1. Association of rs12210050 near the EXOC2 gene, rs7335046 near the UBAC2 gene and rs1805007 in the MC1R gene with the risk of BCC
SNP (major, minor allele) Number of
MAF OR (95% CI)
P-value for heterogeneity
rs12210050 (C, T)
KS_NHS_HPFS (female and male)
Combined set (meta-analysis)
rs7335046 (C, G)
KS_NHS_HPFS (female and male)
Combined set (meta-analysis)
rs1805007 (C, T)
KS_NHS_HPFS (female and male)
Combined set (meta-analysis)
2.3E-06 0.77
4.6E-03
1.1E-04
9.9E-10
9.9E-05
1.1E-05
2.4E-07 0.01
2.9E-08
3.5E-07
5.9E-09 0.22
2.0E-03
6.6E-04
2.5E-05
5.4E-10
4.3E-17
Results for each GWAS of the discovery set were calculated based on the unconditional logistic regression adjusted for age and top-three principal components of
genetic variance. Results for the KS_NHS_HPFS of the replication set were calculated based on the unconditional logistic regression adjusted for age, gender and
top-three principal components of genetic variance. Results for the BCC_NHS and RF_NHS of the replication set were calculated based on unconditional logistic
regression adjusted for age.
BC_NHS, postmenopausal invasive breast cancer case–control study nested within the NHS; T2D_NHS, type 2 diabetes case–control study nested within the
NHS; T2D_HPFS, type 2 diabetes case–control study nested within the HPFS; CHD_NHS, coronary heart disease case–control study nested within the NHS;
CHD_HPFS, coronary heart disease case–control study nested within the HPFS; KS_NHS_HPFS, kidney stone study nested within the NHS and HPFS;
BCC_NHS, BCC case–control study nested within the NHS; RF_NHS, renal function study nested within the NHS.
aGenotyping data used in the previous publication were included for data analysis (7).
results from the two sets. As shown in Supplementary
Material, Table S6, we did not identify significant associations
between either rs12210050 or rs7335046 and melanoma risk.
The OR (95% CI) for rs12210050 and rs7335046 was 1.07
(0.96–1.19) and 1.01 (0.88–1.15), respectively.
In this study, the SNP rs1805007 was identified with the stron-
gest associations with both melanoma and non-melanoma skin
cancers. MC1R encodes a 317-amino acid seven-pass transmembrane receptor; the SNP
rs1805007 encodes an Arg151Cys substitution. A well-known
red hair color variant, the SNP rs1805007, along with other
genetic variants in the MC1R gene, was shown to confer sus-
ceptibility to both melanoma and non-melanoma (BCC and
SCC) skin cancers in our previous study and studies performed
by other groups (4–10). This supports the validity of our
GWAS data and further validates our self-reported BCC data
set. Also, we identified two novel alleles, rs12210050 near
Figure 1. Regional association plot in the 600 kb neighborhood of EXOC2. The left-hand Y-axis shows the association P-value of individual SNPs in the discovery
set, which is plotted as −log10(P) against chromosomal base-pair position. The right-hand Y-axis shows the recombination rate estimated from the
HapMap CEU population. Genotyped SNPs are plotted as diamonds, and imputed SNPs as circles in gray. Blue highlights the SNP rs12210050; bright red indicates
high LD (r² ≥ 0.8) with rs12210050; orange, moderate LD (r² ≥ 0.5 but < 0.8); yellow, weak LD (r² ≥ 0.2 but < 0.5) and white, no LD (r² < 0.2). The genomic
coordinate is in NCBI35/hg17.
Figure 2. Regional association plot in the 600 kb neighborhood of PHGDHL1 (UBAC2). The left-hand Y-axis shows the association P-value of individual SNPs
in the discovery set, which is plotted as −log10(P) against chromosomal base-pair position. The right-hand Y-axis shows the recombination rate estimated from
the HapMap CEU population. Genotyped SNPs are plotted as diamonds, and imputed SNPs as circles in gray. Blue highlights the SNP rs7335046; bright red indicates
high LD (r² ≥ 0.8) with rs7335046; orange, moderate LD (r² ≥ 0.5 but < 0.8); yellow, weak LD (r² ≥ 0.2 but < 0.5) and white, no LD (r² < 0.2). The genomic
coordinate is in NCBI35/hg17. PHGDHL1 is alternatively called UBAC2.
the EXOC2 gene at 6p25 and rs7335046 near the UBAC2 gene
at 13q32, associated with non-melanoma skin cancer. EXOC2
is a component of the exocyst complex involved in the
docking of exocystic vesicles with fusion sites on the plasma
membrane. Some genetic variants in the EXOC2 gene (includ-
ing rs12210050) were identified as contributing to human pig-
mentary traits such as hair color, skin color and tanning
ability, in our previous GWAS on hair color and tanning
ability (14,15). Hence, we performed an additional analysis
for the association between rs12210050 at 6p25 and BCC
risk after further adjusting for pigmentary phenotypes,
tanning tendency and hair color, and the association remained
genome-wide significant in the combined
discovery set and replication set (P = 1.2 × 10⁻⁹). At the
same locus 6p25, Sulem et al. (16) previously identified the
SNP rs1540771 conferring susceptibility to pigmentary pheno-
types, including freckling and skin sensitivity to sun.
However, this SNP was not associated with the risks of
BCC and melanoma in the other previous study conducted
by Gudbjartsson et al. (6). The SNP rs1540771 and the SNP
rs12210050 are not in LD (r² = 0.05 in HapMap CEU). The
SNP rs1540771 showed nominal association with BCC risk
in the discovery set of this study [rs1540771[C]: OR (95%
CI), 0.93 (0.86–1.00); P = 0.047]. This association was eliminated
after adjusting for the SNP rs12210050 (P = 0.42).
The UBAC2 gene encoding ubiquitin-associated domain-
containing protein 2 is alternatively called phosphoglycerate
dehydrogenase-like protein 1 (PHGDHL1). This locus has
been identified as a genetic susceptibility locus for Behçet’s
disease, a chronic systemic inflammatory disease (17).
A possible issue raised in this GWAS is effect heterogeneity.
Although five studies were used in the discovery set
of this study, they came from only two demographically
similar cohorts (NHS and HPFS). It is plausible that differ-
ences in the sampling scheme across the five case–control
sub-studies could in principle introduce some effect hetero-
geneity, although this effect is likely to be small (18). To
flag markers that show evidence of effect heterogeneity, we
have calculated Cochran’s Q statistic (19) and reported the
corresponding P-values in the tables. Also, given the large
number of SNPs (more than 2 million SNPs) analyzed in
this study, nominally significant P-values for heterogeneity
are difficult to interpret, and may represent false positives
due to sampling variation. For example, although there is
some evidence of heterogeneity for the SNP rs7335046 in
the discovery set (P = 0.01), the P-value for heterogeneity
of this SNP in either the replication set or the combined set was not
significant. In addition, as mentioned above, considering the
number of SNPs analyzed in this study, the P-value of 0.01
for heterogeneity in the discovery set is more likely attributable
to chance. Still, we have taken a conservative approach
and excluded the SNPs with P-values for heterogeneity test
< 0.01 from further consideration for replication.
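The heterogeneity test discussed above can be illustrated with a short Python sketch of Cochran's Q: each study's log odds ratio is compared with the inverse-variance pooled estimate, and Q is referred to a chi-square distribution with k − 1 degrees of freedom for k studies. The function names and the input values are hypothetical, and the chi-square tail probability is computed with a closed-form series for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1."""
    if x <= 0.0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over r < df/2 of (x/2)**r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    tail = math.erfc(math.sqrt(x / 2.0))
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0) * total


def cochran_q(betas, ses):
    """Return (Q, df, P) for per-study log odds ratios and standard errors."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1  # needs at least two studies to be informative
    return q, df, chi2_sf(q, df)


if __name__ == "__main__":
    # Hypothetical log odds ratios and standard errors from five studies.
    q, df, p = cochran_q([0.25, 0.18, 0.30, 0.10, 0.22],
                         [0.06, 0.07, 0.10, 0.09, 0.11])
    print(f"Q = {q:.2f}, df = {df}, P = {p:.3f}")
```

With roughly 2 million SNPs tested, a threshold such as P < 0.01 on this statistic is expected to flag many markers by chance alone, which is the point made in the paragraph above.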
In this study, BCC cases used for data analysis were self-
reported. The validity of self-report of BCC in these medically
sophisticated populations has been assessed in previous studies
(20,21). Colditz et al. (20) evaluated the validity of self-
reported illnesses including skin cancer in the NHS. Among
33 random samples of women who had reported non-
melanoma skin cancer, medical records indicated that 30
(91%) had correctly reported the skin cancer. The three incor-
rect self-reports were actinic keratosis, a premalignant skin
lesion. Also, Hunter et al. (21) previously examined the risk
factors of BCC in the NHS using the self-reported cases. As
expected, they found that lighter pigmentation (blonde or red
hair color), less childhood and adolescent tanning tendency
Table 2. Association of rs12210050 near the EXOC2 gene, rs7335046 near the UBAC2 gene and rs1805007 in the MC1R gene with the risk of SCC
SNP (major, minor allele) Genotype Cases (%) Controls (%) OR (95% CI)
rs12210050 (C, T)
CT or TT
7.6E-05
P-value for trend
rs7335046 (C, G)
CG or GG
P-value for trend
rs1805007 (C, T)
CT or TT
P-value for trend
The ORs (95% CIs) were calculated based on the unconditional logistic regression adjusted for age and gender.
and higher tendency to sunburn were associated with an
increased risk of BCC. Also, they found that women residing
in California and Florida were more likely to develop BCC
compared with women living in the Northeast. In addition,
using the self-reported BCC cases, we identified the previously
well-documented MC1R variant (rs1805007) as the strongest locus in this study. These data
support the validity of self-report of BCC in our study.
It is possible that similar biases are present in both the
discovery set and replication set because they were from two
large cohort studies, the NHS and the HPFS. In the discovery
set of this study, 43% of BCC cases were men, whereas 10%
of BCC cases were men in the replication set. Also, we note
that there are some differences between the two cohorts,
such as gender (the NHS is a female cohort, and the HPFS is a
male cohort), geographical background and socioeconomic status.
In summary, in the current GWAS of individuals of Euro-
pean ancestry, we identified two novel loci, the EXOC2 gene
on 6p25 and the UBAC2 gene on 13q32, as associated with
the risks of non-melanoma skin cancer, BCC and SCC. In
addition, we verified the skin cancer susceptibility locus at
the MC1R gene on 16q24. Future studies are warranted to
evaluate the effect of interactions between these promising
SNPs and skin cancer risk factors on the risk of skin cancer.
Understanding the role of these novel loci in the development
of non-melanoma skin cancer could provide important insight
into non-melanoma skin cancer pathogenesis and effectively
improve the prevention of non-melanoma skin cancer.
MATERIALS AND METHODS
Description of study populations
Nurses’ Health Study. The NHS was established in 1976, when
121 700 female US registered nurses between the ages of 30
and 55, residing in 11 larger US states, completed and returned
an initial self-administered questionnaire on their medical his-
tories and baseline health-related exposures, forming the basis
for the NHS cohort. Exposure information on risk factors has been
collected prospectively through biennial questionnaires.
Overall, follow-up has been very high;
after > 20 years, ~90% of participants continue to complete
questionnaires. From May 1989 through September 1990,
we collected blood samples from 32 826 participants in the NHS.
Health Professionals Follow-up Study. In 1986, 51 529 men
from all 50 US states in health professions (dentists, pharmacists,
optometrists, osteopath physicians, podiatrists and veterinarians)
aged 40–75 answered a
questionnaire, forming the basis of the study. The average
follow-up rate for this cohort over 10 years is > 90%.
Between 1993 and 1994, 18 159 study participants provided
blood samples by overnight courier.
Skin cancer ascertainment in NHS and HPFS. Disease
follow-up procedures are identical for both the NHS and
HPFS. Along with exposures every 2 years, outcome data
with appropriate follow-up of reported disease events
including melanoma and non-melanoma skin cancers are col-
lected. For melanoma and SCC, eligible cases are incident
pathologically confirmed invasive cases among subjects who
gave a blood specimen in the NHS and HPFS with a diagnosis
anytime after blood collection. All medical records of mela-
noma and SCC are reviewed by dermatologists blinded to
exposure information according to established criteria. Cases
of BCC are not pathologically confirmed in the NHS
Genotyping in each GWAS of the discovery set. We performed
genotyping in BC_NHS, using the Illumina HumanHap550
array, as part of the National Cancer Institute’s Cancer
Genetic Markers of Susceptibility (CGEMS) Project (22).
For the other four GWASs of the discovery set, we performed
genotyping using the Affymetrix 6.0 array.
Genotyping in the replication set. Nine promising SNPs from
the discovery set were selected for further replication in the
replication set. (i) The genotyping for the KS_NHS_HPFS
was performed using the Illumina HumanHap610 Quad, and
the imputation was performed in the same fashion as in the
discovery set. The genotype data we extracted for these nine
SNPs and their imputation quality data are presented in Sup-
plementary Material, Table S3. (ii) The genotyping for the
BCC_NHS and RF_NHS was performed using OpenArray
assays at the Dana Farber/Harvard Cancer Center Polymorph-
ism Detection Core.
Imputation and statistical methods
In each study of the discovery set, we used MACH v1.0.16 to
impute more than 2.5 million SNPs with HapMap CEU phase
II data (release 22) as the reference panel (23). Imputation
results were expressed as ‘allele dosages’ (fractional values
between 0 and 2). Those MACH dosage files were used for
analysis of imputed data. Imputation R² is an estimate of the correlation
between observed and predicted genotype. It is the
ratio of observed variance to the theoretical variance (23).
The number of genotyped SNPs that passed quality control procedures
and the imputed SNPs with MAF > 2.5% and imputation
R² > 0.3 in each study of the discovery set are presented
2 352 569
2 351 699
2 356 842
2 350 863
2 356 504
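To make the dosage-based quality filter concrete, the following Python sketch computes the allele frequency, MAF and a variance-ratio R² for one SNP from its allele dosages, then applies the MAF > 2.5% and R² > 0.3 thresholds stated above. It assumes the "theoretical variance" is the binomial (Hardy–Weinberg) variance 2p(1 − p) and uses the population variance of the dosages as the observed variance; both choices, and all function names, are assumptions for illustration rather than a description of MACH's internals.

```python
from statistics import fmean, pvariance


def dosage_summary(dosages):
    """Return (allele_frequency, maf, r2_hat) for dosages in the range 0..2."""
    freq = fmean(dosages) / 2.0               # estimated frequency of the counted allele
    maf = min(freq, 1.0 - freq)
    expected_var = 2.0 * freq * (1.0 - freq)  # assumed Hardy-Weinberg variance
    observed_var = pvariance(dosages)         # assumed population variance of dosages
    r2_hat = observed_var / expected_var if expected_var > 0.0 else 0.0
    return freq, maf, r2_hat


def passes_filter(dosages, maf_min=0.025, r2_min=0.3):
    """Apply the MAF and imputation-quality thresholds used for meta-analysis."""
    _, maf, r2_hat = dosage_summary(dosages)
    return maf > maf_min and r2_hat > r2_min


if __name__ == "__main__":
    # Hypothetical dosages: confident genotype calls versus an uninformative SNP.
    confident = [0.0, 1.0, 2.0, 1.0]
    uninformative = [1.0, 1.0, 1.0, 1.0]
    print(dosage_summary(confident), passes_filter(confident))
    print(dosage_summary(uninformative), passes_filter(uninformative))
```

When imputation is uncertain, dosages shrink toward the mean, the observed variance falls below 2p(1 − p) and the ratio drops toward zero, which is why a low R² marks a poorly imputed SNP.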
We fitted an unconditional logistic regression model for
each SNP that passed quality control filters, using an additive
model, controlling for age and the three largest principal com-
ponents of genetic variation of each GWAS of the discovery
set and the KS_NHS_HPFS of the replication set. These prin-
cipal components were calculated for all individuals on the
basis of approximately 10 000 unlinked markers, using the
EIGENSTRAT software (24). In the other two replication sets
of BCC (BCC_NHS and RF_NHS) as well as SCC and mela-
noma sets, each SNP was tested for an association with skin
cancer risk by unconditional logistic regression model adjust-
ing for age and gender.
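The per-SNP test described above can be sketched in Python as an unconditional logistic regression of case status on allele dosage (additive coding), with any adjustment covariates appended as extra columns. The code below is a minimal Newton-Raphson implementation using only the standard library; the function names and the toy data are hypothetical, it does no convergence or separation checking, and it is not the software used in the study.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def score_and_information(x, y, beta):
    """Gradient and Fisher information of the logistic log-likelihood."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(x, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        p = 1.0 / (1.0 + math.exp(-eta))
        for a in range(k):
            grad[a] += (yi - p) * xi[a]
            for c in range(k):
                info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
    return grad, info


def fit_logistic(covariates, case_status, n_iter=25):
    """Return (beta, se); beta[0] is the intercept, beta[1] the first covariate."""
    x = [[1.0] + list(row) for row in covariates]
    k = len(x[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad, info = score_and_information(x, case_status, beta)
        step = solve_linear(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    # Standard errors from the inverse information at the final estimate.
    _, info = score_and_information(x, case_status, beta)
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve_linear(info, unit)[j]))
    return beta, se


if __name__ == "__main__":
    # Toy data: dosage 0 -> 20 cases, 40 controls; dosage 1 -> 30 cases, 30 controls.
    rows = [[0.0]] * 60 + [[1.0]] * 60
    status = [1] * 20 + [0] * 40 + [1] * 30 + [0] * 30
    beta, se = fit_logistic(rows, status)
    print(math.exp(beta[1]), se[1])  # per-allele odds ratio and SE of its log
```

In an adjusted analysis each row would carry further columns, for example `[dosage, age, pc1, pc2, pc3]`, and `beta[1]` would remain the per-allele log odds ratio.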
In each study of the discovery set, SNPs with MAF
> 2.5% and imputation R² > 0.3
were included in further meta-analysis. Estimated log
odds ratios from each study of the discovery set were com-
bined using meta-analysis, with weights proportional to the
inverse variance of the estimate in each study. The same
meta-analysis method was used to combine the results from
the discovery set and replication set.
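The inverse-variance weighting described above can be written out in a few lines of Python. Each study contributes its estimated log odds ratio with weight 1/SE²; the pooled estimate is the weighted mean, and its standard error is the square root of the reciprocal of the summed weights. The function name and the input values are hypothetical, chosen only to show the calculation.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect pooling of log odds ratios; returns (OR, lower, upper, P)."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided P-value; erfc stays accurate for very small tail probabilities.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    lower = math.exp(beta - 1.96 * se)
    upper = math.exp(beta + 1.96 * se)
    return math.exp(beta), lower, upper, p


if __name__ == "__main__":
    # Hypothetical per-study log odds ratios and standard errors.
    odds_ratio, lower, upper, p = inverse_variance_meta(
        [0.25, 0.18, 0.30, 0.10, 0.22],
        [0.06, 0.07, 0.10, 0.09, 0.11],
    )
    print(f"OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f}), P = {p:.2e}")
```

Because the weights depend only on the standard errors, larger studies dominate the pooled estimate, and the same function can combine a discovery-set summary with a replication-set summary.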
Supplementary Material is available at HMG online.
We thank Dr Wei V. Chen for assistance in performing ana-
lyses in the MD Anderson Cancer Center melanoma case–
control study. We thank Pati Soule and Dr Hardeep Ranu of
the Dana Farber/Harvard Cancer Center High-Throughput
Polymorphism Detection Core for sample handling and geno-
typing of the NHS and HPFS samples. We are also indebted to
the participants in all of these studies. We thank the following
state cancer registries for their help: AL, AZ, AR, CA, CO,
CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA,
MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC,
TN, TX, VA, WA, WY.
Conflict of Interest statement. None declared.
We are grateful to Merck Research Laboratories for funding of
the GWAS of coronary heart disease. This work is supported
by NIH grants CA122838, CA87969, CA055075, CA49449,
CA100264 and CA093459.
1. Miller, D.L. and Weinstock, M.A. (1994) Nonmelanoma skin cancer in
the United States: incidence. J. Am. Acad. Dermatol., 30, 774–778.
2. Han, J., Colditz, G.A. and Hunter, D.J. (2006) Risk factors for skin
cancers: a nested case–control study within the Nurses’ Health Study.
Int. J. Epidemiol., 35, 1514–1521.
3. Epstein, E.H. (2008) Basal cell carcinomas: attack of the hedgehog. Nat.
Rev. Cancer, 8, 743–754.
4. Bastiaens, M.T., ter Huurne, J.A., Kielich, C., Gruis, N.A., Westendorp,
R.G., Vermeer, B.J. and Bavinck, J.N. (2001) Melanocortin-1 receptor
gene variants determine the risk of nonmelanoma skin cancer
independently of fair skin and red hair. Am. J. Hum. Genet., 68, 884–894.
5. Box, N.F., Duffy, D.L., Irving, R.E., Russell, A., Chen, W., Griffyths,
L.R., Parsons, P.G., Green, A.C. and Sturm, R.A. (2001) Melanocortin-1
receptor genotype is a risk factor for basal and squamous cell carcinoma.
J. Invest. Dermatol., 116, 224–229.
6. Gudbjartsson, D.F., Sulem, P., Stacey, S.N., Goldstein, A.M., Rafnar, T.,
Sigurgeirsson, B., Benediktsdottir, K.R., Thorisdottir, K., Ragnarsson, R.,
Sveinsdottir, S.G. et al. (2008) ASIP and TYR pigmentation variants
associate with cutaneous melanoma and basal cell carcinoma. Nat. Genet.,
7. Han, J., Kraft, P., Colditz, G.A., Wong, J. and Hunter, D.J. (2006)
Melanocortin 1 receptor variants and skin cancer risk. Int. J. Cancer, 119,
8. Kennedy, C., ter Huurne, J., Berkhout, M., Gruis, N., Bastiaens, M.,
Bergman, W., Willemze, R. and Bavinck, J.N. (2001) Melanocortin 1
receptor (MC1R) gene variants are associated with an increased risk for
cutaneous melanoma which is largely independent of skin type and hair
color. J. Invest. Dermatol., 117, 294–300.
9. Palmer, J.S., Duffy, D.L., Box, N.F., Aitken, J.F., O’Gorman, L.E.,
Green, A.C., Hayward, N.K., Martin, N.G. and Sturm, R.A. (2000)
Melanocortin-1 receptor polymorphisms and risk of melanoma: is the
association explained solely by pigmentation phenotype? Am. J. Hum.
Genet., 66, 176–186.
10. Valverde, P., Healy, E., Sikkink, S., Haldane, F., Thody, A.J., Carothers,
A., Jackson, I.J. and Rees, J.L. (1996) The Asp84Glu variant of the
melanocortin 1 receptor (MC1R) is associated with melanoma. Hum. Mol.
Genet., 5, 1663–1666.
11. Rafnar, T., Sulem, P., Stacey, S.N., Geller, F., Gudmundsson, J.,
Sigurdsson, A., Jakobsdottir, M., Helgadottir, H., Thorlacius, S., Aben,
K.K. et al. (2009) Sequence variants at the TERT-CLPTM1L locus
associate with many cancer types. Nat. Genet., 41, 221–227.
12. Stacey, S.N., Gudbjartsson, D.F., Sulem, P., Bergthorsson, J.T., Kumar,
R., Thorleifsson, G., Sigurdsson, A., Jakobsdottir, M., Sigurgeirsson, B.,
Benediktsdottir, K.R. et al. (2008) Common variants on 1p36 and 1q42
are associated with cutaneous basal cell carcinoma but not with melanoma
or pigmentation traits. Nat. Genet., 40, 1313–1318.
13. Stacey, S.N., Sulem, P., Masson, G., Gudjonsson, S.A., Thorleifsson, G.,
Jakobsdottir, M., Sigurdsson, A., Gudbjartsson, D.F., Sigurgeirsson, B.,
Benediktsdottir, K.R. et al. (2009) New common variants affecting
susceptibility to basal cell carcinoma. Nat. Genet., 41, 909–914.
14. Han, J., Kraft, P., Nan, H., Guo, Q., Chen, C., Qureshi, A., Hankinson,
S.E., Hu, F.B., Duffy, D.L., Zhao, Z.Z. et al. (2008) A genome-wide
association study identifies novel alleles associated with hair color and
skin pigmentation. PLoS Genet., 4, e1000074.
15. Nan, H., Kraft, P., Qureshi, A.A., Guo, Q., Chen, C., Hankinson, S.E., Hu,
F.B., Thomas, G., Hoover, R.N., Chanock, S. et al. (2009) Genome-wide
association study of tanning phenotype in a population of European
ancestry. J. Invest. Dermatol., 129, 2250–2257.
16. Sulem, P., Gudbjartsson, D.F., Stacey, S.N., Helgason, A., Rafnar, T.,
Magnusson, K.P., Manolescu, A., Karason, A., Palsson, A., Thorleifsson,
G. et al. (2007) Genetic determinants of hair, eye and skin pigmentation in
Europeans. Nat. Genet., 39, 1443–1452.
17. Fei, Y., Webb, R., Cobb, B.L., Direskeneli, H., Saruhan-Direskeneli, G.
and Sawalha, A.H. (2009) Identification of novel genetic susceptibility
loci for Behcet’s disease using a genome-wide association study. Arthritis
Res. Ther., 11, R66.
18. Monsees, G.M., Tamimi, R.M. and Kraft, P. (2009) Genome-wide
association scans for secondary traits using case–control samples. Genet.
Epidemiol., 33, 717–728.
19. Higgins, J.P. and Thompson, S.G. (2002) Quantifying heterogeneity in a
meta-analysis. Stat. Med., 21, 1539–1558.
20. Colditz, G.A., Martin, P., Stampfer, M.J., Willett, W.C., Sampson, L.,
Rosner, B., Hennekens, C.H. and Speizer, F.E. (1986) Validation of
questionnaire information on risk factors and disease outcomes in a
prospective cohort study of women. Am. J. Epidemiol., 123, 894–900.
21. Hunter, D.J., Colditz, G.A., Stampfer, M.J., Rosner, B., Willett, W.C. and
Speizer, F.E. (1990) Risk factors for basal cell carcinoma in a prospective
cohort of women. Ann. Epidemiol., 1, 13–23.
22. Hunter, D.J., Kraft, P., Jacobs, K.B., Cox, D.G., Yeager, M., Hankinson,
S.E., Wacholder, S., Wang, Z., Welch, R., Hutchinson, A. et al. (2007)
A genome-wide association study identifies alleles in FGFR2 associated
with risk of sporadic postmenopausal breast cancer. Nat. Genet., 39,
23. Li, Y., Willer, C.J., Ding, J., Scheet, P. and Abecasis, G.R. (2010) MaCH:
using sequence and genotype data to estimate haplotypes and unobserved
genotypes. Genet. Epidemiol., 34, 816–834.
24. Price, A.L., Patterson, N.J., Plenge, R.M., Weinblatt, M.E., Shadick, N.A.
and Reich, D. (2006) Principal components analysis corrects for
stratification in genome-wide association studies. Nat. Genet., 38,