Common variants in breast cancer risk loci predispose to distinct tumor subtypes

Abstract

Background: Genome-wide association studies (GWAS) have identified multiple common breast cancer susceptibility variants. Many of these variants have differential associations by estrogen receptor (ER) status, but how these variants relate to other tumor features and intrinsic molecular subtypes is unclear.
Methods: Among 106,571 invasive breast cancer cases and 95,762 controls of European ancestry with data on 173 breast cancer variants identified in previous GWAS, we used novel two-stage polytomous logistic regression models to evaluate variants in relation to multiple tumor features (ER, progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) and grade) adjusting for each other, and to intrinsic-like subtypes.
Results: Eighty-five of 173 variants were associated with at least one tumor feature (false discovery rate < 5%), most commonly ER and grade, followed by PR and HER2. Models for intrinsic-like subtypes found nearly all of these variants (83 of 85) associated at p < 0.05 with risk for at least one luminal-like subtype, and approximately half (41 of 85) of the variants were associated with risk of at least one non-luminal subtype, including 32 variants associated with triple-negative (TN) disease. Ten variants were associated with risk of all subtypes with different magnitudes. Five variants were associated with risk of luminal A-like and TN subtypes in opposite directions.
Conclusion: This report demonstrates a high level of complexity in the etiologic heterogeneity of breast cancer susceptibility variants and can inform investigations of subtype-specific risk prediction.
Ahearn et al. Breast Cancer Research (2022) 24:2
https://doi.org/10.1186/s13058-021-01484-x
RESEARCH ARTICLE
Thomas U. Ahearn1† , Haoyu Zhang1,2†, Kyriaki Michailidou3,4,5, Roger L. Milne6,7,8, Manjeet K. Bolla4,
Joe Dennis4, Alison M. Dunning9, Michael Lush4, Qin Wang4, Irene L. Andrulis10,11, Hoda Anton‑Culver12,
Volker Arndt13, Kristan J. Aronson14, Paul L. Auer15,16, Annelie Augustinsson17, Adinda Baten18, Heiko Becher19,
Sabine Behrens20, Javier Benitez21,22, Marina Bermisheva23,24, Carl Blomqvist25,26, Stig E. Bojesen27,28,29,
Bernardo Bonanni30, Anne‑Lise Børresen‑Dale31,32, Hiltrud Brauch33,34,35, Hermann Brenner13,36,37,
Angela Brooks‑Wilson38,39, Thomas Brüning40, Barbara Burwinkel41,42, Saundra S. Buys43, Federico Canzian44,
Jose E. Castelao45, Jenny Chang‑Claude20,46, Stephen J. Chanock1, Georgia Chenevix‑Trench47,
Christine L. Clarke48, NBCS Collaborators, J. Margriet Collée49, Angela Cox50, Simon S. Cross51, Kamila Czene52,
Mary B. Daly53, Peter Devilee54,55, Thilo Dörk56, Miriam Dwek57, Diana M. Eccles58, D. Gareth Evans59,60,
Peter A. Fasching61, Jonine Figueroa62,63, Giuseppe Floris18, Manuela Gago‑Dominguez64,65, Susan M. Gapstur66,
José A. García‑Sáenz67, Mia M. Gaudet66, Graham G. Giles6,7,8, Mark S. Goldberg68,69, Anna González‑Neira21,
Grethe I. Grenaker Alnæs31, Mervi Grip70, Pascal Guénel71, Christopher A. Haiman72, Per Hall52,73, Ute Hamann74,
Elaine F. Harkness75,76,77, Bernadette A. M. Heemskerk‑Gerritsen78, Bernd Holleczek79, Antoinette Hollestelle78,
Maartje J. Hooning78, Robert N. Hoover1, John L. Hopper7, Anthony Howell80, ABCTB Investigators, kConFab/
AOCS Investigators, Milena Jakimovska81, Anna Jakubowska82,83, Esther M. John84,85, Michael E. Jones86,
Audrey Jung20, Rudolf Kaaks20, Saila Kauppila87, Renske Keeman88, Elza Khusnutdinova89,23, Cari M. Kitahara90,
Yon‑Dschun Ko91, Stella Koutros1, Vessela N. Kristensen32,92, Ute Krüger17, Katerina Kubelka‑Sabit93,
Allison W. Kurian84,85, Kyriacos Kyriacou94,5, Diether Lambrechts95,96, Derrick G. Lee97,98, Annika Lindblom99,100,
Martha Linet90, Jolanta Lissowska101, Ana Llaneza102, Wing‑Yee Lo33,103, Robert J. MacInnis6,7,
Arto Mannermaa104,105,106, Mehdi Manoochehri74, Sara Margolin73,107, Maria Elena Martinez65,
Catriona McLean108, Alfons Meindl109, Usha Menon110, Heli Nevanlinna111, William G. Newman59,60,
Jesse Nodora65,112, Kenneth Offit113, Håkan Olsson17, Nick Orr114, Tjoung‑Won Park‑Simon56, Alpa V. Patel66,
Julian Peto115, Guillermo Pita116, Dijana Plaseska‑Karanfilska81, Ross Prentice15, Kevin Punie117, Katri Pylkäs118,119,
Paolo Radice120, Gad Rennert121, Atocha Romero122, Thomas Rüdiger123, Emmanouil Saloustros124,
Sarah Sampson125, Dale P. Sandler126, Elinor J. Sawyer127, Rita K. Schmutzler128,129,130, Minouk J. Schoemaker86,
Ben Schöttker13,131, Mark E. Sherman132, Xiao‑Ou Shu133, Snezhana Smichkoska134, Melissa C. Southey6,135,8,
John J. Spinelli136,137, Anthony J. Swerdlow86,138, Rulla M. Tamimi139, William J. Tapper58, Jack A. Taylor126,140,
Lauren R. Teras66, Mary Beth Terry141, Diana Torres142,74, Melissa A. Troester143, Celine M. Vachon144,
© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/)
applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Open Access
*Correspondence: montserrat.garcia‑closas@nih.gov
Thomas U. Ahearn, Haoyu Zhang, Montserrat García‑Closas, and Nilanjan
Chatterjee have contributed equally to this work
1 Division of Cancer Epidemiology and Genetics, Department of Health
and Human Services, Medical Center Drive, National Cancer Institute,
National Institutes of Health, Rockville, MD, USA
Full list of author information is available at the end of the article
Introduction
Breast cancer represents a heterogeneous group of diseases with different molecular and clinical features [1]. Estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) and histological grade are routinely assessed in the clinic to inform treatment strategies and prognostication [2]. Combined, these tumor features define five intrinsic-like subtypes (i.e., luminal A-like, luminal B-like/HER2-negative, luminal B-like/HER2-positive, HER2-positive/non-luminal, and triple-negative) that are correlated with intrinsic subtypes defined by gene expression panels [2, 3]. Most known breast cancer risk or protective factors are related to luminal or hormone receptor (ER or PR) positive tumors, whereas less is known about the etiology of triple-negative (TN) tumors, an aggressive subtype [4, 5].
Breast cancer genome-wide association studies (GWAS) have identified over 170 common susceptibility variants, most of them single nucleotide polymorphisms (SNPs), many of which are differentially associated with ER-positive and ER-negative disease [6–8]. These include 20 variants that primarily predispose to ER-negative or TN disease [7, 8]. However, few studies have evaluated variant associations with other tumor features, or simultaneously studied multiple, correlated tumor markers to identify source(s) of etiologic heterogeneity [7, 9–13]. We recently developed a two-stage polytomous logistic regression method that efficiently characterizes etiologic heterogeneity while accounting for tumor marker correlations and missing tumor data [14, 15]. This method can help describe complex relationships between susceptibility variants and multiple tumor features, helping to clarify breast cancer subtype etiologies and increasing the power to generate more accurate estimates of the risk associations between susceptibility variants and less common subtypes. We recently demonstrated the power of this method in a GWAS to identify novel breast cancer susceptibility variants accounting for tumor heterogeneity [15].
In this report, we sought to expand our understanding of etiologic heterogeneity across breast cancer subtypes by applying the two-stage polytomous logistic regression methodology to a large study population from the Breast Cancer Association Consortium (BCAC). We characterized in detail the risk associations of 173 breast cancer risk variants identified by GWAS [6, 7] with tumor subtypes defined by ER, PR, HER2 and tumor grade.
Keywords: Breast cancer, Etiologic heterogeneity, Genetic predisposition, Common breast cancer susceptibility
variants
Carolien H. M. van Deurzen145, Elke M. van Veen59,60, Philippe Wagner17, Clarice R. Weinberg146,
Camilla Wendt73,107, Jelle Wesseling88,147, Robert Winqvist118,119, Alicja Wolk148,149, Xiaohong R. Yang1,
Wei Zheng133, Fergus J. Couch150, Jacques Simard151, Peter Kraft152,153, Douglas F. Easton9,4, Paul D. P. Pharoah9,4,
Marjanka K. Schmidt88,154, Montserrat García‑Closas1*† and Nilanjan Chatterjee155,156†
Methods
Study population and genotyping
The study population and genotyping are described in previous publications [6, 7] and in Additional file 3: Methods. We included invasive cases and controls from 81 BCAC studies with genotyping data from two Illumina genome-wide custom arrays, the iCOGS and OncoArray (106,571 cases (OncoArray: 71,788; iCOGS: 34,783) and 95,762 controls (OncoArray: 58,134; iCOGS: 37,628); Additional file 1: Table S1). All subjects in the study population were female and of European ancestry, with European ancestry determined by ancestry informative GWAS markers as previously described [6]. We evaluated 173 breast cancer risk variants that were identified in or replicated by prior BCAC analyses as associated with breast cancer risk at a p-value threshold of p < 5.0 × 10⁻⁸ [6, 7]. Most of these variants (n = 153) were identified because of their association with risk of overall breast cancer, and a small number of variants (n = 20) were identified because of their association specific to ER-negative breast cancer (Additional file 1: Table S2). These 173 variants have not previously been simultaneously investigated for evidence of tumor heterogeneity with multiple tumor markers [6, 7, 15, 16]. Genotypes for the variants marking the 173 susceptibility loci were determined by genotyping with the iCOGS and OncoArray arrays and imputation to the 1000 Genomes Project (Phase 3) reference panel.
Statistical analysis
An overview of the analytic strategy is shown in Fig. 1, and a detailed discussion of the statistical methods, including the two-stage polytomous logistic regression, is provided in Additional file 3: Methods and elsewhere [14, 15]. Briefly, we used two-stage polytomous regression models that allow modelling of genetic associations with breast cancer while accounting for underlying heterogeneity in associations by combinations of multiple tumor markers. The models use a parsimonious decomposition of subtype-specific case–control odds-ratio parameters in terms of marker-specific case-case odds-ratio parameters [14, 15]. We introduced further parsimony by using the mixed-effect formulation of the model, which allows ER-specific case-case parameters to be treated as fixed and the corresponding parameters for the other markers (PR, HER2 and grade (as an ordinal variable)) as random. We used an expectation–maximization (EM) algorithm [17] for parameter estimation under this model to account for missing data in tumor characteristics.
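The decomposition described above can be illustrated with a minimal sketch. It assumes an additive second stage in which each subtype-specific case–control log odds ratio equals a baseline term plus one case-case term per tumor marker; the exact parameterization is given in [14, 15] and Additional file 3 and is not reproduced here. The parameter values, the 0/1/2 coding of grade, and the function names are hypothetical and serve only as illustration.

```python
from itertools import product

# Hypothetical second-stage parameters for one variant (illustrative values only):
# a baseline log-OR plus one case-case log-OR parameter per tumor marker.
theta = {"baseline": 0.10, "ER": 0.08, "PR": 0.02, "HER2": -0.03, "grade": -0.04}


def second_stage_design():
    """Enumerate all marker combinations and return (labels, design rows).

    Each row is [1, ER, PR, HER2, grade_score]. Binary markers are coded 0/1
    and grade 1, 2, 3 is coded ordinally as 0, 1, 2 (an assumed coding).
    """
    labels, rows = [], []
    for er, pr, her2, grade in product((0, 1), (0, 1), (0, 1), (1, 2, 3)):
        labels.append((er, pr, her2, grade))
        rows.append([1, er, pr, her2, grade - 1])
    return labels, rows


def subtype_log_odds_ratios(params):
    """Subtype-specific case-control log-ORs implied by the second-stage parameters."""
    coef = [params["baseline"], params["ER"], params["PR"], params["HER2"], params["grade"]]
    labels, rows = second_stage_design()
    return {
        label: sum(c * z for c, z in zip(coef, row))
        for label, row in zip(labels, rows)
    }


log_ors = subtype_log_odds_ratios(theta)
```

Under this sketch, 24 subtype-specific log odds ratios (2 × 2 × 2 × 3 marker combinations) are described by five parameters, which is the sense in which the decomposition is parsimonious.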
Our primary aim was to identify which of 173 known breast cancer susceptibility variants showed heterogeneous risk associations by ER, PR and HER2 status and tumor grade. This was tested using a global heterogeneity test by ER, PR, HER2 and/or grade, with a mixed-effect two-stage polytomous model (model 1), fitted separately for each variant. The global null hypothesis was that there was no difference in risk of breast cancer associated with the variant genotype across any of the tumor features being evaluated. We accounted for multiple testing (173 tests, one for each variant) of the global heterogeneity test by controlling the false discovery rate (FDR) at < 5% with the Benjamini–Hochberg procedure [18].
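For readers unfamiliar with the Benjamini–Hochberg step-up procedure, a minimal sketch follows. It is a generic implementation of the published procedure, not the code used for the analyses reported here, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * fdr, and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten global heterogeneity tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example, fdr=0.05)
```

In the setting described above, `p_values` would hold the 173 global heterogeneity p-values, one per variant.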
For the variants showing evidence of global heterogeneity after FDR adjustment, we further evaluated which of the tumor features contributed to the heterogeneity by fitting a fixed-effects two-stage model (model 2) that simultaneously tested for associations with each tumor feature (this model was fitted for each variant separately). We used a threshold of p < 0.05 for marker-specific tumor heterogeneity tests to describe which specific tumor marker(s) contributed to the observed heterogeneity, adjusting for the other tumor markers in the model. This p-value threshold was used only for descriptive purposes, as the primary hypotheses were tested using the FDR-adjusted global test for heterogeneity described above.
We conducted additional analyses to further explore evidence of heterogeneity. We fitted a fixed-effect two-stage model (model 3) to estimate case–control odds ratios (ORs) and 95% confidence intervals (CIs) between the variants and five intrinsic-like subtypes defined by combinations of ER, PR, HER2 and grade: (1) luminal A-like (ER+ and/or PR+, HER2-, grade 1 or 2); (2) luminal B-like/HER2-negative (ER+ and/or PR+, HER2-, grade 3); (3) luminal B-like/HER2-positive (ER+ and/or PR+, HER2+); (4) HER2-positive/non-luminal (ER- and PR-, HER2+), and (5) TN (ER-, PR-, HER2-). We also fitted a fixed-effect two-stage model to estimate case–control ORs and 95% CIs with tumor grade (model 4; defined ordinally as grade 1, grade 2, and grade 3) for the variants associated at p < 0.05 only with grade in case-case comparisons from model 2.
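The five intrinsic-like subtype definitions above can be written as a small classification rule. The sketch below is illustrative only: the function name is hypothetical, and returning no subtype when a required marker is missing is an assumption that mirrors a complete-data definition, not the EM-based handling of missing markers used in the two-stage model.

```python
def intrinsic_like_subtype(er, pr, her2, grade):
    """Classify a tumor into one of the five intrinsic-like subtypes.

    er, pr, her2: True (positive), False (negative) or None (missing).
    grade: 1, 2, 3 or None (missing).
    Returns the subtype name, or None when a required marker is missing
    (an assumed convention for this sketch).
    """
    if er is None or pr is None or her2 is None:
        return None
    hormone_receptor_positive = er or pr  # "ER+ and/or PR+"
    if hormone_receptor_positive:
        if her2:
            return "luminal B-like/HER2-positive"
        if grade is None:
            return None  # grade is needed to separate luminal A-like from B-like
        return "luminal A-like" if grade in (1, 2) else "luminal B-like/HER2-negative"
    return "HER2-positive/non-luminal" if her2 else "triple-negative"
```

Note that grade enters the definition only for hormone receptor-positive, HER2-negative tumors, so the other three subtypes can be assigned without it.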
To help describe sources of heterogeneity from different tumor characteristics in models 2 and 3, we performed cluster analyses based on Euclidean distance calculated from the absolute z-statistics that were estimated by the individual marker-specific tumor heterogeneity tests (model 2) and the case–control associations with risk of intrinsic-like subtypes (model 3). The clusters were used only for presentation purposes and were not intended to suggest strictly defined categories, nor were they intended to suggest that the variants are associated with tumor markers through similar biological mechanisms. Clustering was performed in R using the function Heatmap as implemented by the package "ComplexHeatmap" version 3.1 [19]. Additional details for calculating Euclidean distance using absolute z-statistics are provided in Additional file 3: Methods.
We performed sensitivity analyses, in which we estimated the ORs and 95% CIs between the variants and the intrinsic-like subtypes by implementing a standard polytomous model that defined the intrinsic-like subtypes using only the available tumor marker data (not using the EM algorithm to account for missing data in tumor markers). We analyzed OncoArray and iCOGS array data separately for all analyses, adjusting for the first ten principal components for ancestry-informative variants, and then meta-analyzed the results.
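As an illustration of how array-specific estimates might be combined, the sketch below pools log odds ratios with an inverse-variance-weighted fixed-effect meta-analysis and converts the pooled estimate to an OR with a 95% CI. The text does not state which meta-analysis method was used, so the fixed-effect choice is an assumption; the function names and input values are hypothetical.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (log_or, standard_error) pairs by inverse-variance weighting.

    Returns the pooled log-OR and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * log_or for w, (log_or, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


def odds_ratio_with_ci(log_or, se, z=1.959964):
    """Convert a log-OR and standard error to (OR, lower, upper) for a 95% CI."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical per-array estimates for one variant and one subtype.
oncoarray = (0.10, 0.02)
icogs = (0.20, 0.02)
pooled_log_or, pooled_se = fixed_effect_meta([oncoarray, icogs])
pooled_or, ci_low, ci_high = odds_ratio_with_ci(pooled_log_or, pooled_se)
```

With equal standard errors the pooled log-OR is the simple average of the two inputs; with unequal standard errors the more precise array receives more weight.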
Results
The mean (SD) ages at diagnosis (cases) and enrollment (controls) were 56.6 (12.2) and 56.4 (12.2) years, respectively. Among cases with information on the corresponding tumor marker, 81% were ER-positive, 68% PR-positive, 83% HER2-negative and 69% grade 1 or 2 (Table 1; see Additional file 1: Table S1 for details by study). Additional file 1: Table S3 shows the correlation between the tumor markers. ER was positively correlated with PR (r = 0.61) and inversely correlated with HER2 (r = -0.16) and grade (r = -0.39). The most common intrinsic-like subtype was luminal A-like (54%), followed by TN (14%), luminal B-like/HER2-negative (13%), luminal B-like/HER2-positive (13%) and HER2-positive/non-luminal (6%; Table 1). These frequencies varied across BCAC studies because the studies were diverse in both design and country of origin (Additional file 1: Table S1). Notably, there are few population-based data on the frequencies of intrinsic-like subtypes [20, 21]. The overall frequencies in our study population are generally similar to those reported by SEER for non-Hispanic white females and the Scottish cancer registry [20, 21]; however, given the diverse sources of our data, they are not directly comparable to country-specific cancer registries.
Fig. 1 Overview of the analytic strategy and results from the investigation of 173 known breast cancer susceptibility variants for evidence of heterogeneity of effect according to the estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and grade.
^a We evaluated 173 breast cancer risk variants identified in or replicated by prior BCAC GWAS [6, 7]; see Methods and Additional file 3: Methods sections for more details.
^b Model 1 (primary analyses): Mixed-effect two-stage polytomous model (ER as fixed-effect, and PR, HER2 and grade as random-effects) for global heterogeneity tests (i.e. case-case comparisons from stage 2 of the two-stage model) between each individual risk variant and any of the tumor features (separate models were fit for each variant).
^c Model 2: Fixed-effect two-stage polytomous model for marker-specific tumor heterogeneity tests (i.e. case-case comparisons from stage 2 of the two-stage model) between each individual variant and each of the tumor features (ER, PR, HER2, and grade), mutually adjusted for each other (separate models were fit for each variant).
^d Model 3: Fixed-effect two-stage polytomous model for risk associations with intrinsic-like subtypes (i.e. case–control comparisons from stage 1 of the two-stage model): luminal A-like, luminal B-like/HER2-negative, luminal B-like/HER2-positive, HER2-positive/non-luminal, and triple-negative.
^e Model 4: Fixed-effect two-stage polytomous model for risk associations with tumor grade (i.e. case–control comparisons from stage 1 of the two-stage model) for the 12 variants associated at p < 0.05 only with grade in case-case comparisons (from model 2): grade 1, grade 2, and grade 3.
Figure 1 shows an overview of the analytic strategy and results from three main analyses performed separately for each variant: (1) global test for heterogeneity by all tumor markers (model 1; primary hypothesis), (2) marker-specific tumor test for heterogeneity for each marker, adjusting for the others (model 2), and (3) estimation of case–control ORs (95% CIs) by intrinsic-like subtypes (model 3) and by grade (model 4).
Global test for heterogeneity by tumor markers (primary hypothesis)
Mixed-effects two-stage models (model 1) were fitted for each of the 173 variants separately and included terms for ER, PR, HER2 and grade to test for global heterogeneity by any of the tumor features (case-case comparison). This model identified 85 of 173 (49.1%) variants with evidence of heterogeneity by at least one tumor feature (FDR < 5%; Figs. 1, 2; Additional file 1: Fig. S1).
Marker-specific tumor test for heterogeneity for each marker, adjusting for other markers
Fixed-effects two-stage models (model 2) were used to test which of the correlated tumor markers was responsible for the observed global heterogeneity (case-case comparison). Figure 2 and Additional file 1: Fig. S1 show results of these analyses clustered by case-case z-values of associations between susceptibility variants and each tumor marker for the 173 variants. For the 85 variants with observed global heterogeneity, these analyses identified ER and grade as the two features that most often contributed to the observed heterogeneity (45 and 33 variants had marker-specific p < 0.05 for ER and grade, respectively), and 29 variants were associated with more than one tumor feature (Figs. 1, 2, Additional file 1: Fig. S1). Eighteen of these 85 variants showed no associations with any individual tumor marker at p < 0.05 (Fig. 2, Additional file 1: Fig. S1). Twenty-one variants were associated at p < 0.05 only with ER, 12 variants only with grade, four variants only with PR and one variant only with HER2 (Fig. 2, Additional file 1: Fig. S1, see footnotes).
Estimation of case–control ORs (95% CIs) by intrinsic-like subtypes (model 3)
Fixed-effects two-stage models for intrinsic-like subtypes (model 3) were fitted for each of the 85 variants with evidence of global heterogeneity to estimate ORs (95% CIs) for risk associations with each subtype (case–control comparisons). Additional file 1: Fig. S2 shows a summary of these analyses for the 85 variants, clustered by case–control z-value of association between susceptibility variants and breast cancer intrinsic-like subtypes, and Additional file 2: Fig. S3 shows forest plots for associations with risk by tumor subtypes. Nearly all (83 of 85) variants were associated with risk (p < 0.05) for at least one luminal-like subtype, and approximately half (41 of 85) of the variants were associated with risk of at least one non-luminal subtype, including 32 variants that were associated with risk of TN disease (Fig. 1, Additional file 1: Fig. S2 footnote ‘h’). Ten variants were associated with risk of all subtypes (Fig. 1, Additional file 1: Fig. S2 footnote ‘j’). Below we describe examples of groups of variants showing different patterns of association with intrinsic-like subtypes (Fig. 3a–d).
Fig. 2 Heatmap of the z-values from the fixed-effects two-stage polytomous model for marker-specific heterogeneity tests (case-case comparison from model 2) for the association between each of the 173 breast cancer susceptibility variants and estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) or grade, adjusting for principal components and each tumor marker. Columns represent individual variants; rows, as labeled in the figure, are HER2, PR, ER and grade. For more detailed information on the context of the figure, see Additional file 1: Fig. S1
Two variants in linkage disequilibrium (LD, r² = 0.73) at 10q26.13 (rs2981578 and rs35054928) and 16q12.1-rs4784227 had the strongest evidence of association with risk of luminal-like subtypes (Fig. 3a, Additional file 1: Fig. S2). The two variants at 10q26.13 showed no evidence of association with the TN subtype, and a weaker association with the HER2-positive/non-luminal subtype. In contrast, 16q12.1-rs4784227 was strongly associated with risk of all luminal-like subtypes and, more weakly, with risk of HER2-positive/non-luminal and TN subtypes (Fig. 3a, Additional file 1: Fig. S2).
Three variants, 19p13.11-rs67397200, 5p15.33-rs10069690 and 1q32.11-rs4245739, showed the strongest evidence of associations with risk of TN disease. All three of these variants showed weaker or no evidence of associations with risk of the other subtypes (Fig. 3b, Additional file 1: Fig. S2).
Fig. 3 Results from fixed-effects two-stage polytomous models for risk associations^a with intrinsic-like subtypes (model 3) for variants with evidence of heterogeneity by tumor markers in the two-stage model (model 1)^b; panels show examples of variants (a) most strongly associated with luminal-like subtypes, (b) most strongly associated with TN subtypes, (c) associated with all subtypes with varying strengths of association, and (d) associated with luminal A-like and TN subtypes in different directions. Each panel lists the variant and its predicted target gene^c and plots the odds ratio and 95% CI for each intrinsic-like subtype^d (luminal A-like, luminal B-like/HER2-negative, luminal B-like/HER2-positive, HER2-positive/non-luminal, triple-negative). See Additional file 1: Fig. S2 for more details
^a Per-minor allele odds ratio (95% confidence limits).
^b Model 1, mixed-effects two-stage polytomous model testing for global heterogeneity according to estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) and grade.
^c Predicted target genes as reported in Fachal L, et al. Nature Genetics 2020; 52 (1), 56-73.
^d Luminal A-like (ER+ and/or PR+, HER2-, grade 1 & 2); luminal B-like/HER2-negative (ER+ and/or PR+, HER2-, grade 3); luminal B-like/HER2-positive (ER+ and/or PR+, HER2+); HER2-positive/non-luminal (ER- and PR-, HER2+), and triple-negative (ER-, PR-, HER2-).
Two variants in low LD (r² = 0.17) at 6q25, rs9397437 and rs3757322, and a third variant in 6q25, rs2747652, which was not in LD (r² < 0.01) with rs9397437 or rs3757322, showed strong evidence of being associated with risk of all subtypes. rs9397437 and rs3757322 were most strongly associated with risk of TN disease. rs2747652 was most strongly associated with risk of HER2-positive subtypes (Fig. 3c, Additional file 1: Fig. S2).
Five variants were associated with risk of luminal A-like disease in an opposite direction to their association with risk of TN disease. 1q32.1-rs6678914, 2p23.2-rs4577244, and 19p13.11-rs67397200 had weaker evidence of associations with risk of luminal A-like disease compared to associations with risk of TN disease, and 10p12.31-rs7072776 and 22q12.1-rs17879961 (I157T) had stronger evidence of an association with risk of luminal A-like disease compared to their association with risk of TN disease (Fig. 3d, Additional file 1: Fig. S2; for rs67397200 see Fig. 3b).
Estimation of case–control ORs (95% CIs) by tumor grade (model 4)
Case–control associations by tumor grade for the 12 variants that were associated at p < 0.05 only with grade in case-case comparisons are shown in Additional file 2: Fig. S4. 13q13.1-rs11571833, 1p22.3-rs17426269 and 11q24.3-rs11820646 showed stronger evidence for predisposing to risk of high-grade subtypes, and the remaining variants showed stronger evidence for predisposing to risk of low-grade subtypes.
When limiting analyses to cases with intrinsic-like subtypes defined only by available tumor marker data, results from case–control analyses were similar, but less precise than results from the two-stage polytomous regression model using the EM algorithm to account for missing tumor marker data (Additional file 1: Table S4).
Discussion
This study demonstrates the extent and complexity of genetic etiologic heterogeneity among 173 breast cancer risk variants by multiple tumor characteristics, using novel methodology in the largest and most comprehensive investigation conducted to date. We found compelling evidence that about half of the investigated breast cancer susceptibility loci (85 of 173 variants) predispose to tumors with different characteristics. We identified tumor grade as an important determinant of etiologic heterogeneity and confirmed the importance of ER status. Associations with individual tumor features translated into differential associations with the risk of intrinsic-like subtypes defined by their combinations.
Many of the variants with evidence of global hetero-
geneity predisposed to risk of multiple subtypes, but
with different magnitudes. For example, three vari-
ants identified in early GWAS for overall breast cancer,
FGFR2 (rs35054928 and rs2981578)[22, 23] and 8q24.21
(rs13281615)[22], were associated with luminal-like
and HER2-positive/non-luminal subtypes, but not with
TN disease. rs4784227 located near TOX3[22, 24] and
rs62355902 located in a MAP3K1[22] regulatory ele-
ment, were associated with risk of all five subtypes. Of
the five variants found associated in opposite direc-
tions with luminal A-like and TN disease, we previously
reported rs6678914 and rs4577244 to have opposite
effects between ER-negative and ER-positive tumors[7].
rs17879961 (I157T), a likely causal[16] missense variant
located in a CHEK2 functional domain that reduces or
abolishes substrate binding[25], was previously reported
to have opposite directions of effects on lung adeno-
carcinoma and lung squamous cell carcinoma and for
Table 1 Distribution of estrogen receptor (ER), progesterone
receptor (PR), human epidermal growth factor receptor 2 (HER2),
and grade and the intrinsic‑like subtypes for cases of invasive
breast cancer in studies from the Breast Cancer Consortium
Association
Luminal A-like (ER + and/or PR + , HER2-, grade 1 & 2); Luminal B-like/HER2-
negative (ER + and/or PR + , HER2-, grade 3); Luminal B-like/HER2-positive
(ER + and/or PR + , HER2 +); HER2-positive/non-luminal (ER- and PR-, HER2 +),
and triple-negative (ER-, PR-, HER2-)
Tumor marker N (%)
ER
Negative 16,900 (19%)
Positive 70,030 (81%)
Unknown 19,641
PR
Negative 24,283 (32%)
Positive 51,603 (68%)
Unknown 30,685
HER2
Negative 47,693 (83%)
Positive 9,529 (17%)
Unknown 49,349
Grade
1 15,583 (20%)
2 37,568 (49%)
3 24,382 (31%)
Unknown 29,038
Intrinsic‑like subtypes
Luminal A‑like 27,510 (54%)
Luminal B‑like/HER2‑negative 6,804 (13%)
Luminal B‑like/HER2‑positive 6,511 (13%)
HER2‑positive/non‑luminal 2,797 (6%)
Triple‑negative 7,178 (14%)
Unknown 55,771
Page 8 of 13
Ahearnetal. Breast Cancer Research (2022) 24:2
lung cancer between smokers and non-smokers[26, 27].
Moreover, the risk association of rs17879961 has been
reported to vary across tissue locations/cell-types, as this
variant has been associated with a higher risk of pancre-
atic ductal adenocarcinoma [28], chronic lymphocytic
leukemia [29], and colorectal cancer [30], and also asso-
ciated with a lower risk of aerodigestive squamous cell
carcinoma [31] and ovarian cancer [32]. To our knowl-
edge, rs67397200 and rs7072776 have not previously
been shown to be associated with subtypes in opposite
directions. In a prior breast cancer GWAS that applied
the two-stage polytomous model for risk variant discov-
ery, we also identified five variants associated with risk
of luminal A-like and TN disease in opposite directions
[15]. Overall, these findings suggest that the same biolog-
ical pathway has opposite effects on the susceptibility to
different tumor types. This interpretation is supported by
functional characterization of rs36115365, a variant on
5p15.33, which was found to have similar cis-regulatory
effects on TERT in multiple cell lines from different
cancers, but was associated with a higher risk of
pancreatic and testicular cancer and a lower risk of lung
cancer [33]. Alternatively, a causal variant may differently
influence cis-gene regulation and/or alter different bio-
logical pathways depending on the cell or tissue of ori-
gin [34]. Further studies of these variants are required
to clarify the biological mechanisms for these apparent
cross-over effects.
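
The intrinsic-like subtype definitions in the Table 1 footnote can be read as a small decision rule. The following Python sketch is illustrative only: the function name and the encoding of marker status (True for positive, False for negative, None for unknown) are assumptions, not part of the study's pipeline. Returning "unknown" whenever a required marker is missing is also a simplification of this sketch.

```python
from typing import Optional


def intrinsic_like_subtype(
    er: Optional[bool],
    pr: Optional[bool],
    her2: Optional[bool],
    grade: Optional[int],
) -> str:
    """Map marker status to an intrinsic-like subtype label (Table 1 footnote)."""
    if her2 is None:
        return "unknown"  # every definition requires HER2 status

    # Hormone-receptor positive: ER+ and/or PR+ (one positive marker suffices).
    if er is True or pr is True:
        if her2:
            return "luminal B-like/HER2-positive"
        if grade in (1, 2):
            return "luminal A-like"
        if grade == 3:
            return "luminal B-like/HER2-negative"
        return "unknown"  # HER2-negative luminal tumors also need grade

    # Hormone-receptor negative requires both ER- and PR- to be observed.
    if er is False and pr is False:
        return "HER2-positive/non-luminal" if her2 else "triple-negative"

    return "unknown"
```

Under this rule a tumor with ER unknown but PR positive and HER2 positive is still classifiable, whereas a tumor with ER negative and PR unknown is not, because non-luminal status needs both receptors to be negative.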
In prior ER-negative GWAS, we identified 20 vari-
ants that predispose to ER-negative disease, of which
five variants were only or most strongly associated with
risk of TN disease (rs4245739, rs10069690, rs74911261,
rs11374964, and rs67397200)[7, 8]. We confirmed these
five variants to be most strongly associated with TN
disease. The remaining 15 previously identified variants
all showed associations with risk of non-luminal sub-
types, especially TN disease, and for all but four variants
(rs17350191, rs200648189, rs6569648, and rs322144),
evidence of global heterogeneity was observed.
Little is known regarding PR and HER2 as sources of
etiologic heterogeneity independent of ER status. Of the
four variants that showed evidence of heterogeneity only
according to PR, rs10759243[6, 35], rs11199914[36] and
rs72749841[6] were previously found primarily associ-
ated with risk of ER-positive disease, and rs10816625
was found to be associated with risk of ER-positive/PR-
positive tumors, but not other ER/PR combinations[12].
rs10995201 was the only variant found in case-case
comparisons to be solely associated with HER2 status,
although the evidence was not strong, requiring fur-
ther confirmation. Previously, rs10995201 showed no
evidence of being associated with ER status[37]. Most
variants associated with PR or HER2 had not been
investigated for PR or HER2 heterogeneity while adjusting
for ER [9–13]. We previously reported rs10941679
to be associated with PR-status, independent of ER, and
also with grade[10]. We also found suggestive evidence
of PR-specific heterogeneity for 16q12-rs3803662[13],
which is in high LD (r² = 0.78) with rs4784227 (TOX3),
a variant strongly associated with PR status. Our find-
ings for rs2747652 are also consistent with a prior BCAC
fine-mapping analysis across the ESR1 locus, which
found rs2747652 to be associated with risk of the HER2-
positive/non-luminal subtype and high grade independ-
ent of ER[9]. rs2747652 overlaps an enhancer region
and is associated with reduced ESR1 and CCDC170
expression[9].
Histologic grade is a composite of multiple tumor characteristics,
including mitotic count, nuclear pleomorphism,
and degree of tubule or gland formation; therefore,
susceptibility variants associated with tumor grade could
affect multiple biological pathways [38]. Evidence from
comparisons of tumor morphology and genomic and
molecular alterations suggests that tumor grade is likely
a ‘stable’ tumor feature that does not progress from low
to high grade [39–42]; thus, the variants associated with
grade are likely not associated with grade progression.
Among the 12 variants identified with evidence of het-
erogeneity by grade only, rs17426269, rs11820646, and
rs11571833 were most strongly associated with risk of
grade 3 disease. rs11571833 lies in the BRCA2 coding
region and produces a truncated form of the protein[43]
and has been shown to be associated with both risk of TN
disease and risk of serous ovarian tumors, both of which
tend to be high-grade[44]. To our knowledge, rs17426269
and rs11820646 have not been investigated in relation to
grade heterogeneity. The remaining nine variants were all
more strongly associated with grade 1 or grade 2 disease.
Six of these variants were previously reported to be asso-
ciated primarily with ER-positive disease[6, 36, 45, 46],
highlighting the importance of accounting for multiple
tumor characteristics to better illuminate heterogeneity
sources.
We identified 18 variants with evidence of global heterogeneity
(FDR < 5%) but no significant (marker-specific
p < 0.05) association with any individual tumor
characteristic. This is likely explained by the fact that
the fixed-effects models used to test for association with
specific tumor markers are less powerful than the mixed-effects
models used to test the primary hypothesis of global
heterogeneity by any tumor marker [14].
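
The FDR < 5% criterion applied to the global heterogeneity tests can be illustrated with a short sketch. This section does not state which FDR procedure was used, so the code below assumes the Benjamini-Hochberg step-up procedure; the function name and the p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float], fdr: float = 0.05) -> List[bool]:
    """Return one flag per p-value: True if rejected at the given FDR level."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            largest_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical global-heterogeneity p-values for five variants.
example_p = [0.001, 0.045, 0.025, 0.2, 0.012]
flags = benjamini_hochberg(example_p)
```

In this made-up example the three smallest p-values pass their rank-specific thresholds, so three of the five variants would be declared heterogeneous at FDR < 5%.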
To help describe and visualize the strength of the evi-
dence for common heterogeneity patterns, we performed
clustered analyses of z-values for tumor marker-specific
heterogeneity tests and case–control associations with
risk of intrinsic-like subtypes. Because they are based on
z-values, these clusters reflect differences in sample size
and statistical power to detect associations between variants
and specific tumor subtypes. Thus, clusters should
not be interpreted as strictly defined categories.
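
The dependence of z-values on sample size can be made concrete with a rough calculation. The sketch below assumes a common large-sample approximation for the standard error of a per-allele log odds ratio, SE ≈ sqrt((1/n_cases + 1/n_controls) / (2f(1 − f))), where f is the allele frequency. This approximation, the function name, the allele frequency, and the odds ratio are assumptions for illustration; the case counts come from Table 1 and the control count from the abstract.

```python
import math


def approx_z(odds_ratio: float, allele_freq: float,
             n_cases: int, n_controls: int) -> float:
    """Approximate z-value for a per-allele log odds ratio.

    Uses SE ~ sqrt((1/n_cases + 1/n_controls) / (2 f (1 - f))),
    an assumption of this sketch rather than the study's model.
    """
    variance = (1 / n_cases + 1 / n_controls) / (
        2 * allele_freq * (1 - allele_freq)
    )
    return math.log(odds_ratio) / math.sqrt(variance)


N_CONTROLS = 95_762          # controls reported in the abstract
SUBTYPE_CASES = {            # case counts from Table 1
    "luminal A-like": 27_510,
    "triple-negative": 7_178,
}

# Hypothetical variant with the same effect (OR = 1.05, f = 0.30) in both subtypes.
z_by_subtype = {
    name: approx_z(1.05, 0.30, n, N_CONTROLS)
    for name, n in SUBTYPE_CASES.items()
}
```

With identical effect sizes, the approximate z-value is roughly 4.6 for luminal A-like disease and roughly 2.6 for TN disease, so a clustering based on z-values would separate the two subtypes purely because of their different case counts.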
A major strength of our study is our large sample size
of over 100,000 breast cancer cases with tumor marker
information, and a similar number of controls, making
this the largest, most comprehensive breast cancer het-
erogeneity investigation. Our application of the two-stage
polytomous logistic regression enabled adjusting for
multiple, correlated tumor markers and accounting for
missing tumor marker data. This is a more powerful and
efficient modeling strategy for identifying heterogeneity
sources among highly correlated tumor markers than
standard polytomous logistic regression [14, 15].
In simulated and real data analyses, we have demonstrated
that in the presence of heterogeneous associations
across subtypes, the two-stage model is more powerful
than polytomous logistic regression for detecting heterogeneity.
Moreover, we have demonstrated that in the
presence of correlated markers, the two-stage model,
incorporating all markers simultaneously, has a much
better ability to distinguish the true source(s) of heterogeneity
than testing for heterogeneity by analyzing one
marker at a time [14, 15]. In prior analyses, we showed
that the two-stage polytomous regression is a power-
ful approach to identify susceptibility variants that dis-
play tumor heterogeneity[15]. Notably, in this prior
investigation we excluded the genomic regions in which
the 173 variants that were investigated in this work are
located[15].
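
The parsimony of a two-stage formulation can be illustrated through its second stage. Instead of estimating a separate log odds ratio for every subtype defined by cross-classifying the markers, the subtype-specific log odds ratios are written as a baseline effect plus one case-case parameter per marker. The sketch below is a simplified additive version and not the implementation used in this study: the marker coding (0/1 for receptors, grade as an ordinal score from 1 to 3), the omission of interaction terms, the function names, and the parameter values are all assumptions.

```python
from itertools import product
from typing import Dict, List, Tuple

Subtype = Tuple[int, int, int, int]  # (ER, PR, HER2, grade)

# All 2 x 2 x 2 x 3 = 24 subtypes defined by the four markers.
SUBTYPES: List[Subtype] = list(product((0, 1), (0, 1), (0, 1), (1, 2, 3)))


def design_row(subtype: Subtype) -> List[int]:
    """Second-stage design row: intercept plus one column per marker."""
    er, pr, her2, grade = subtype
    return [1, er, pr, her2, grade]


def subtype_log_ors(theta: List[float]) -> Dict[Subtype, float]:
    """Expand five second-stage parameters into 24 subtype log odds ratios."""
    return {
        s: sum(z * t for z, t in zip(design_row(s), theta))
        for s in SUBTYPES
    }


# Hypothetical parameters: baseline effect, then case-case effects for
# ER, PR, HER2, and grade. Only the ER term is non-zero here.
theta = [0.02, 0.06, 0.0, 0.0, 0.0]
betas = subtype_log_ors(theta)
```

Five parameters describe 24 subtype-specific effects, and testing whether a single column's parameter is zero asks whether that marker is a source of heterogeneity after accounting for the others.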
Our study also has some limitations. First, many
breast cancer cases from studies included in this report
had missing information on one or more tumor char-
acteristics. ER tumor status data was available for 81%
of cases, but missing data for the other tumor markers
ranged from 27 to 46%. To address this limitation, we
implemented an EM algorithm that allowed a powerful
analysis to incorporate cases with missing tumor charac-
teristics under the assumption that tumor characteristics
are missing at random (MAR), i.e., the underlying reason
for missing data may depend on observed tumor markers
and/or covariate values, but not on the missing values
themselves [47]. If this assumption is violated, it can
lead to an inflated type I error rate [14]. However, in the
context of genetic association testing, the missingness
mechanism would also need to be related to the genetic
variants under study, which is unlikely. The 88 variants
that did not meet the p-value threshold for significant
heterogeneity in the global test are likely to represent a
combination of variants that are associated with risk of
all investigated tumor subtypes with similar effects and
variants for which we lacked power to detect evidence of
global heterogeneity due to weak effect sizes or uncom-
mon allele frequencies. In addition, our study focused on
investigating ER, PR, HER2, and grade as heterogeneity
sources; future studies with more detailed tumor charac-
terization could reveal additional etiologic heterogeneity
sources.
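
The missingness figures quoted in this paragraph can be checked against the "Unknown" rows of Table 1. This is a simple arithmetic sketch; the variable names are illustrative, and the total is the number of invasive cases reported in the abstract.

```python
TOTAL_CASES = 106_571  # invasive cases reported in the abstract

# "Unknown" counts per marker, copied from Table 1.
UNKNOWN = {"ER": 19_641, "PR": 30_685, "HER2": 49_349, "grade": 29_038}

# Fraction of cases with a missing value for each marker.
missing_fraction = {
    marker: n_unknown / TOTAL_CASES for marker, n_unknown in UNKNOWN.items()
}
# Fraction of cases with an observed value for each marker.
available_fraction = {m: 1 - frac for m, frac in missing_fraction.items()}
```

ER status is available for between 81% and 82% of cases, and the missing fractions for PR, HER2, and grade are about 29%, 46%, and 27%, matching the reported range of 27 to 46%.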
Conclusion
Our findings provide insights into the complex etiologic
heterogeneity patterns of common breast cancer susceptibility
loci. These findings may inform future studies,
such as fine-mapping and functional analyses, to identify
the underlying causal variants and clarify the biological
mechanisms that drive genetic predisposition to breast
cancer subtypes. Moreover, these analyses provide precise
relative risk estimates for different intrinsic-like subtypes
that could improve the discriminatory accuracy of
subtype-specific polygenic risk scores [48].
Abbreviations
GWAS: Genome-wide association studies; ER: Estrogen receptor; PR: Progesterone
receptor; HER2: Human epidermal growth factor receptor 2; SNPs: Single
nucleotide polymorphisms; FDR: False discovery rate; TN: Triple-negative;
BCAC: Breast Cancer Association Consortium; EM: Expectation–maximization;
OR: Odds ratio; 95% CI: 95% Confidence interval; LD: Linkage disequilibrium.
Supplementary Information
The online version contains supplementary material available at
https://doi.org/10.1186/s13058-021-01484-x.
Additional file 1. Figures S1 and S2 and Tables S1–S4. This file contains
supplementary figures 1-2 and supplementary tables 1-4.
Additional file 2. Figures S3 and S4. This file contains supplementary
figures S3 and S4.
Additional file 3. Methods. This file contains the supplementary
methods.
Additional file 4. Funding and Acknowledgement. This file contains the
additional funding not included in the main text, the acknowledgments,
and the names of the people in the collaboration groups.
Acknowledgements
A full description of the acknowledgments is provided in the Additional file 4:
Funding and Acknowledgement. NBCS Collaborators: greal@rr‑research.no.
ABCTB Investigators: mythily.sachchithananthan@sydney.edu.au. kConFab/
AOCS Investigators: heather.thorne@petermac.org
Authors’ contributions
Writing group: TUA, HZ, KMI, RLM, FJC, JSi, PKr, DFE, PDPP, MKS, MG‑C, NCh;
Statistical analysis: HZ, TUA, MG‑C, NCh; Provision of DNA samples and/or phe‑
notypic data: KMi, RLM, MKB, JDen, AMD, MLus, QW, ILA, HA‑C, VA, KJA, PLA,
AAu, AB, HBec, SBe, JBen, MBerm, CBl, SEB, BBon, A‑LB‑BD, HBra, HBre, AB‑W,
TB, BBur, SSB, FC, JEC, JC‑C, SJC, GC‑T, CLC, NBCS, JMC, ACox, SSC, KCz, MBD,
PD, TD, MDw, DME, DGE, PAF, JFi, GFl, MG‑D, SMG, JAG‑S, MMG, GGG, MSG,
AG–N, GIG, MGrip, PGu, CAH, PHall, UH, EFH, BAMH‑G, BHo, AHol, MJH, RNH,
JLHo, AHow, ABCTB, kConFab/AOCS, MJa, AJak, EMJ, MEJ, AJu, RKa, SKaup, RKe,
EKh, CMKi, Y‑DK, SKou, VNK, UK, KK‑S, AWK, KKy, DLa, DGL, ALin, MLin, JLis, AL,
W‑LL, RJM, AMan, MMan, SMar, MEM, CMc, AMe, UMe, HNe, WGN, JNo, KOf,
HO, NO, T‑WP‑S, AVP, JPet, GPi, DPK, RP, KPu, KPy, PRa, GR, ARo, TRü, ES, SS, DPS,
EJS, RKS, MJS, BSch, MES, X‑OS, SSm, MCS, JJS, AJS, RMT, WJT, JAT, LRT, MBT, DT,
MAT, CMV, CHMVD, EMVV, PWa, CRW, CWe, JWe, RWi, AW, XRY, WZ, FJC, JSi, PKr,
DFE, PDPP, MKS, MG‑C. All authors read and approved the final version of the
manuscript.
Funding
Open Access funding provided by the National Institutes of Health (NIH).
This project has been funded in part with Federal funds from the National
Cancer Institute Intramural Research Program, National Institutes of Health. Dr.
Nilanjan Chatterjee was supported by NHGRI (1R01 HG010480‑01). Dr. Haoyu
Zhang was supported by National Cancer Institute (1K99 CA256513). OncoAr‑
ray genotyping was funded by the government of Canada through Genome
Canada and the Canadian Institutes of Health Research (GPH‑129344), the
Ministère de l’Économie, de la Science et de l’Innovation du Québec through
Génome Québec, the Quebec Breast Cancer Foundation for the PERSPEC‑
TIVE project, the US National Institutes of Health (NIH) (1 U19 CA 148065 for
the Discovery, Biology and Risk of Inherited Variants in Breast Cancer (DRIVE)
project and X01HG007492 to the Center for Inherited Disease Research (CIDR)
under contract HHSN268201200008I), Cancer Research UK (C1287/A16563),
the Odense University Hospital Research Foundation (Denmark), the National
R&D Program for Cancer Control–Ministry of Health and Welfare (Republic of
Korea) (1420190), the Italian Association for Cancer Research (AIRC; IG16933),
the Breast Cancer Research Foundation, the National Health and Medical
Research Council (Australia) and German Cancer Aid (110837). iCOGS geno‑
typing was funded by the European Union (HEALTH‑F2‑2009–223175), Cancer
Research UK (C1287/A10710, C1287/A10118 and C12292/A11174), NIH grants
(CA128978, CA116167 and CA176785) and the Post‑Cancer GWAS initiative
(1U19 CA148537, 1U19 CA148065 and 1U19 CA148112 (GAME‑ON initia‑
tive)), an NCI Specialized Program of Research Excellence (SPORE) in Breast
Cancer (CA116201), the Canadian Institutes of Health Research (CIHR) for the
CIHR Team in Familial Risks of Breast Cancer, the Ministère de l’Économie,
Innovation et Exportation du Québec (PSR‑SIIRI‑701), the Komen Foundation
for the Cure, the Breast Cancer Research Foundation and the Ovarian Cancer
Research Fund. A full description of the funding is provided in the Additional
file 4: Funding and Acknowledgement.
Availability of data and materials
The datasets generated and/or analyzed during the current study are part of
the Breast Cancer Association Consortium and would be available with the
appropriate permissions, including an application process and appropriate
data transfer agreements.
Declarations
Ethics approval and consent to participate
All the studies included in these analyses were approved by local IRBs.
Consent for publication
Not applicable.
Competing interests
The authors have no competing interests to declare.
Author details
1 Division of Cancer Epidemiology and Genetics, Department of Health
and Human Services, Medical Center Drive, National Cancer Institute, National
Institutes of Health, Rockville, MD, USA. 2 Department of Biostatistics, Johns
Hopkins Bloomberg School of Public Health, Baltimore, MD, USA. 3 Institute
of Neurology & Genetics, Biostatistics Unit, Nicosia, Cyprus. 4 Centre for Cancer
Genetic Epidemiology, Department of Public Health and Primary Care,
University of Cambridge, Cambridge, UK. 5 Cyprus School of Molecular
Medicine, Institute of Neurology & Genetics, Nicosia, Cyprus. 6 Cancer
Epidemiology Division, Cancer Council Victoria, Melbourne, VIC, Australia.
7 Centre for Epidemiology and Biostatistics, Melbourne School of Population
and Global Health, The University of Melbourne, Melbourne, VIC, Australia.
8 Precision Medicine, School of Clinical Sciences at Monash Health, Monash
University, Clayton, VIC, Australia. 9 Centre for Cancer Genetic Epidemiology,
Department of Oncology, University of Cambridge, Cambridge, UK. 10 Fred A.
Litwin Center for Cancer Genetics, Lunenfeld‑Tanenbaum Research Institute
of Mount Sinai Hospital, Toronto, ON, Canada. 11 Department of Molecular
Genetics, University of Toronto, Toronto, ON, Canada. 12 Department
of Medicine, Genetic Epidemiology Research Institute, University of California
Irvine, Irvine, CA, USA. 13 Division of Clinical Epidemiology and Aging Research,
German Cancer Research Center (DKFZ), Heidelberg, Germany. 14 Department
of Public Health Sciences, and Cancer Research Institute, Queen’s University,
Kingston, ON, Canada. 15 Cancer Prevention Program, Fred Hutchinson Cancer
Research Center, Seattle, WA, USA. 16 Zilber School of Public Health, University
of Wisconsin‑Milwaukee, Milwaukee, WI, USA. 17 Department of Cancer
Epidemiology, Clinical Sciences, Lund University, Lund, Sweden. 18 Leuven
Multidisciplinary Breast Center, Department of Oncology, Leuven Cancer
Institute, University Hospitals Leuven, Leuven, Belgium. 19 Institute of Medical
Biometry and Epidemiology, University Medical Center Hamburg‑Eppendorf,
Hamburg, Germany. 20 Division of Cancer Epidemiology, German Cancer
Research Center (DKFZ), Heidelberg, Germany. 21 Human Cancer Genetics
Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
22 Biomedical Network On Rare Diseases (CIBERER), Madrid, Spain. 23 Institute
of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian
Academy of Sciences, Ufa, Russia. 24 Saint Petersburg State University,
Saint‑Petersburg, Russia. 25 Department of Oncology, Helsinki University
Hospital, University of Helsinki, Helsinki, Finland. 26 Department of Oncology,
Örebro University Hospital, Örebro, Sweden. 27 Faculty of Health and Medical
Sciences, University of Copenhagen, Copenhagen, Denmark. 28 Department
of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University
Hospital, Herlev, Denmark. 29 Copenhagen General Population Study, Herlev
and Gentofte Hospital, Copenhagen University Hospital, Herlev, Denmark.
30 Division of Cancer Prevention and Genetics, IEO, European Institute
of Oncology IRCCS, Milan, Italy. 31 Department of Cancer Genetics, Institute
for Cancer Research, Oslo University Hospital‑Radiumhospitalet, Oslo, Norway.
32 Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo,
Norway. 33 Dr. Margarete Fischer‑Bosch‑Institute of Clinical Pharmacology,
Stuttgart, Germany. 34 iFIT‑Cluster of Excellence, University of Tübingen,
Tübingen, Germany. 35 German Cancer Consortium (DKTK), German Cancer
Research Center (DKFZ), Partner Site Tübingen, Tübingen, Germany. 36 German
Cancer Consortium (DKTK), German Cancer Research Center (DKFZ),
Heidelberg, Germany. 37 Division of Preventive Oncology, German Cancer
Research Center (DKFZ), National Center for Tumor Diseases (NCT), Heidelberg,
Germany. 38 Genome Sciences Centre, BC Cancer Agency, Vancouver, BC,
Canada. 39 Department of Biomedical Physiology and Kinesiology, Simon
Fraser University, Burnaby, BC, Canada. 40 Institute for Prevention and Occupa‑
tional Medicine of the German Social Accident Insurance, Institute, Ruhr
University Bochum (IPA), Bochum, Germany. 41 Molecular Epidemiology Group,
German Cancer Research Center (DKFZ), C080 Heidelberg, Germany.
42 Molecular Biology of Breast Cancer, University Womens Clinic Heidelberg,
University of Heidelberg, Heidelberg, Germany. 43 Department of Medicine,
Huntsman Cancer Institute, Salt Lake City, UT, USA. 44 Genomic Epidemiology
Group, German Cancer Research Center (DKFZ), Heidelberg, Germany.
45 Oncology and Genetics Unit, Instituto de Investigacion Sanitaria Galicia Sur
(IISGS), Xerencia de Xestion Integrada de Vigo‑SERGAS, Vigo, Spain. 46 Cancer
Epidemiology Group, University Cancer Center Hamburg (UCCH), University
Medical Center Hamburg‑Eppendorf, Hamburg, Germany. 47 Department
of Genetics and Computational Biology, QIMR Berghofer Medical Research
Institute, Brisbane, QLD, Australia. 48 Westmead Institute for Medical Research,
University of Sydney, Sydney, NSW, Australia. 49 Department of Clinical
Genetics, Erasmus University Medical Center, Rotterdam, The Netherlands.
50 Department of Oncology and Metabolism, Sheffield Institute for Nucleic
Acids (SInFoNiA), University of Sheffield, Sheffield, UK. 51 Department
of Neuroscience, Academic Unit of Pathology, University of Sheffield, Sheffield,
UK. 52 Department of Medical Epidemiology and Biostatistics, Karolinska
Institutet, Stockholm, Sweden. 53 Department of Clinical Genetics, Fox Chase
Cancer Center, Philadelphia, PA, USA. 54 Department of Human Genetics,
Leiden University Medical Center, Leiden, The Netherlands. 55 Department
of Pathology, Leiden University Medical Center, Leiden, The Netherlands.
56 Gynaecology Research Unit, Hannover Medical School, Hannover, Germany.
57 School of Life Sciences, University of Westminster, London, UK. 58 Faculty
of Medicine, University of Southampton, Southampton, UK. 59 North West
Genomics Laboratory Hub, Manchester Centre for Genomic Medicine, St
Mary’s Hospital, Manchester University NHS Foundation Trust, Manchester
Academic Health Science Centre, Manchester, UK. 60 Division of Evolution
and Genomic Sciences, School of Biological Sciences, Faculty of Biology,
Medicine and Health, University of Manchester, Manchester Academic Health
Science Centre, Manchester, UK. 61 Department of Gynecology and Obstetrics
Comprehensive Cancer Center Erlangen‑EMN, Friedrich‑Alexander University
Erlangen‑Nuremberg, University Hospital Erlangen, Erlangen, Germany.
62 Usher Institute of Population Health Sciences and Informatics, The University
of Edinburgh, Edinburgh, UK. 63 Cancer Research UK Edinburgh Centre, The
University of Edinburgh, Edinburgh, UK. 64 Fundación Pública Galega de
Medicina Xenómica, Instituto de Investigación Sanitaria de Santiago de
Compostela (IDIS), Complejo Hospitalario Universitario de Santiago, SERGAS,
Santiago de Compostela, Spain. 65 Moores Cancer Center, University
of California San Diego, La Jolla, CA, USA. 66 Behavioral and Epidemiology
Research Group, American Cancer Society, Atlanta, GA, USA. 67 Medical
Oncology Department, Centro Investigación Biomédica en Red de Cáncer
(CIBERONC), Hospital Clínico San Carlos, Instituto de Investigación Sanitaria
San Carlos (IdISSC), Madrid, Spain. 68 Division of Clinical Epidemiology, Royal
Victoria Hospital, McGill University, Montréal, QC, Canada. 69 Department
of Medicine, McGill University, Montréal, QC, Canada. 70 Department of Surgery,
Oulu University Hospital, University of Oulu, Oulu, Finland. 71 Center
for Research in Epidemiology and Population Health (CESP), Team Exposome
and Heredity, INSERM, University Paris‑Saclay, Villejuif, France. 72 Department
of Preventive Medicine, Keck School of Medicine, University of Southern
California, Los Angeles, CA, USA. 73 Department of Oncology, Södersjukhuset,
Stockholm, Sweden. 74 Molecular Genetics of Breast Cancer, German Cancer
Research Center (DKFZ), Heidelberg, Germany. 75 Division of Informatics,
Imaging and Data Sciences, Faculty of Biology, Medicine and Health,
University of Manchester, Manchester Academic Health Science Centre,
Manchester, UK. 76 Nightingale & Genesis Prevention Centre, Wythenshawe
Hospital, Manchester University NHS Foundation Trust, Manchester, UK. 77 NIHR
Manchester Biomedical Research Unit, Manchester University NHS Foundation
Trust, Manchester Academic Health Science Centre, Manchester, UK.
78 Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam,
The Netherlands. 79 Saarland Cancer Registry, Saarbrücken, Germany. 80 Division
of Cancer Sciences, University of Manchester, Manchester, UK. 81 Research
Centre for Genetic Engineering and Biotechnology “Georgi D. Efremov”, MASA,
Skopje, Republic of North Macedonia. 82 Department of Genetics and Pathol‑
ogy, Pomeranian Medical University, Szczecin, Poland. 83 Independent
Laboratory of Molecular Biology and Genetic Diagnostics, Pomeranian Medical
University, Szczecin, Poland. 84 Department of Epidemiology & Population
Health, Stanford University School of Medicine, Stanford, CA, USA. 85 Depart‑
ment of Medicine, Division of Oncology, Stanford Cancer Institute, Stanford
University School of Medicine, Stanford, CA, USA. 86 Division of Genetics
and Epidemiology, The Institute of Cancer Research, London, UK. 87 Depart‑
ment of Pathology, Oulu University Hospital, University of Oulu, Oulu, Finland.
88 Division of Molecular Pathology, The Netherlands Cancer Institute ‑ Antoni
Van Leeuwenhoek Hospital, Amsterdam, The Netherlands. 89 Department
of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia.
90 Radiation Epidemiology Branch, Division of Cancer Epidemiology
and Genetics, National Cancer Institute, Bethesda, MD, USA. 91 Department
of Internal Medicine, Johanniter Kliniken Bonn, Johanniter Krankenhaus, Bonn,
Germany. 92 Department of Medical Genetics, Oslo University Hospital
and University of Oslo, Oslo, Norway. 93 Department of Histopathology
and Cytology, Clinical Hospital Acibadem Sistina, Skopje, Republic of North
Macedonia. 94 Cancer Genetics, Therapeutics and Ultrastructural Pathology, The
Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus. 95 Laboratory
for Translational Genetics, Department of Human Genetics, University
of Leuven, Leuven, Belgium. 96 VIB Center for Cancer Biology, Leuven, Belgium.
97 Cancer Control Research, BC Cancer, Vancouver, BC, Canada. 98 Department
of Mathematics and Statistics, St. Francis Xavier University, Antigonish, NS,
Canada. 99 Department of Molecular Medicine and Surgery, Karolinska
Institutet, Stockholm, Sweden. 100 Department of Clinical Genetics, Karolinska
University Hospital, Stockholm, Sweden. 101 Department of Cancer Epidemiol‑
ogy and Prevention, M. Sklodowska‑Curie National Research Institute
of Oncology, Warsaw, Poland. 102 General and Gastroenterology Surgery
Service, Hospital Universitario Central de Asturias, Oviedo, Spain. 103 University
of Tübingen, Tübingen, Germany. 104 Institute of Clinical Medicine, Pathology
and Forensic Medicine, University of Eastern Finland, Kuopio, Finland.
105 Translational Cancer Research Area, University of Eastern Finland, Kuopio,
Finland. 106 Biobank of Eastern Finland, Kuopio University Hospital, Kuopio,
Finland. 107 Department of Clinical Science and Education, Karolinska Institutet,
Södersjukhuset Stockholm, Sweden. 108 Anatomical Pathology, The Alfred
Hospital, Melbourne, VIC, Australia. 109 Department of Gynecology and Obstet‑
rics, University of Munich, Campus Großhadern, Munich, Germany. 110 Institute
of Clinical Trials & Methodology, University College London, London, UK.
111 Department of Obstetrics and Gynecology, Helsinki University Hospital,
University of Helsinki, Helsinki, Finland. 112 Herbert Wertheim School of Public
Health and Human Longevity Science, University of California San Diego, La
Jolla, CA, USA. 113 Clinical Genetics Research Lab, Department of Cancer
Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, NY,
USA. 114 Centre for Cancer Research and Cell Biology, Queen’s University Belfast,
Belfast, Ireland, UK. 115 Department of Non‑Communicable Disease Epidemiol‑
ogy, School of Hygiene and Tropical Medicine, London, UK. 116 Human
Genotyping‑CEGEN Unit, Human Cancer Genetic Program, Spanish National
Cancer Research Centre, Madrid, Spain. 117 Department of General Medical
Oncology and Multidisciplinary Breast Center, Leuven Cancer Institute,
University Hospitals Leuven, Leuven, Belgium. 118 Laboratory of Cancer
Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit,
University of Oulu, Biocenter Oulu, Oulu, Finland. 119 Laboratory of Cancer
Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, Oulu,
Finland. 120 Unit of Molecular Bases of Genetic Risk and Genetic Testing,
Department of Research, Fondazione IRCCS Istituto Nazionale Dei Tumori
(INT), Milan, Italy. 121 Technion Faculty of Medicine, Clalit National Cancer
Control Center, Carmel Medical Center, Haifa, Israel. 122 Medical Oncology
Department, Hospital Universitario Puerta de Hierro, Madrid, Spain. 123 Institute
of Pathology, Staedtisches Klinikum Karlsruhe, Karlsruhe, Germany. 124 Depart‑
ment of Oncology, University Hospital of Larissa, Larissa, Greece. 125 Prevent
Breast Cancer Centre and Nightingale Breast Screening Centre, Manchester
University NHS Foundation Trust, Manchester, UK. 126 Epidemiology Branch,
National Institute of Environmental Health Sciences, NIH, Research Triangle
Park, NC, USA. 127 School of Cancer & Pharmaceutical Sciences, Comprehensive
Cancer Centre, Guy’s Campus, King’s College London, London, UK. 128 Center
for Integrated Oncology (CIO), Faculty of Medicine, University Hospital
Cologne, University of Cologne, Cologne, Germany. 129 Center for Molecular
Medicine Cologne (CMMC), Faculty of Medicine, University Hospital Cologne,
University of Cologne, Cologne, Germany. 130 Center for Familial Breast
and Ovarian Cancer, Faculty of Medicine, University Hospital Cologne,
University of Cologne, Cologne, Germany. 131 Network Aging Research,
University of Heidelberg, Heidelberg, Germany. 132 Department of Health
Sciences Research, Mayo Clinic College of Medicine, Jacksonville, FL, USA.
133 Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiol‑
ogy Center, Vanderbilt‑Ingram Cancer Center, Vanderbilt University School
of Medicine, Nashville, TN, USA. 134 Medical Faculty, Ss. Cyril and Methodius
University in Skopje, University Clinic of Radiotherapy and Oncology, Skopje,
Republic of North Macedonia. 135 Department of Clinical Pathology, The
University of Melbourne, Melbourne, VIC, Australia. 136 Population Oncology,
BC Cancer, Vancouver, BC, Canada. 137 School of Population and Public Health,
University of British Columbia, Vancouver, BC, Canada. 138 Division of Breast
Cancer Research, The Institute of Cancer Research, London, UK. 139 Department
of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
140 Epigenetic and Stem Cell Biology Laboratory, National Institute of Environ‑
mental Health Sciences, NIH, Research Triangle Park, NC, USA. 141 Department
of Epidemiology, Mailman School of Public Health, Columbia University, New
York, NY, USA. 142 Institute of Human Genetics, Pontificia Universidad Javeriana,
Bogota, Colombia. 143 Department of Epidemiology, Gillings School of Global
Public Health and UNC Lineberger Comprehensive Cancer Center, University
of North Carolina at Chapel Hill, Chapel Hill, NC, USA. 144 Department of Health
Science Research, Division of Epidemiology, Mayo Clinic, Rochester, MN, USA.
145 Department of Pathology, Erasmus University Medical Center, Rotterdam,
The Netherlands. 146 Biostatistics and Computational Biology Branch, National
Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC,
USA. 147 Department of Pathology, The Netherlands Cancer Institute ‑ Antoni
Van Leeuwenhoek Hospital, Amsterdam, The Netherlands. 148 Institute
of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.
149 Department of Surgical Sciences, Uppsala University, Uppsala, Sweden.
150 Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester,
MN, USA. 151 Genomics Center, Department of Molecular Medicine, Centre
Hospitalier Universitaire de Québec, Université Laval Research Center,
Université Laval, Québec City, QC, Canada. 152 Program in Genetic Epidemiol‑
ogy and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston,
MA, USA. 153 Department of Epidemiology, Harvard T.H. Chan School of Public
Health, Boston, MA, USA. 154 Division of Psychosocial Research and Epidemiol‑
ogy, The Netherlands Cancer Institute ‑ Antoni Van Leeuwenhoek Hospital,
Amsterdam, The Netherlands. 155 Department of Biostatistics, Bloomberg
School of Public Health, Johns Hopkins University, Baltimore, MD, USA.
156 Department of Oncology, School of Medicine, Johns Hopkins University,
Baltimore, MD, USA. 157 Department of Research, Vestre Viken Hospital,
Drammen, Norway. 158 Institute of Clinical Medicine, Faculty of Medicine,
University of Oslo, Oslo, Norway. 159 Section for Breast‑ and Endocrine Surgery,
Department of Cancer, Division of Surgery, Cancer and Transplantation
Medicine, Oslo University Hospital‑Ullevål, Oslo, Norway. 160 Department
of Radiology and Nuclear Medicine, Oslo University Hospital, Oslo, Norway.
161 Department of Pathology, Akershus University Hospital, Lørenskog, Norway.
162 Department of Tumor Biology, Institute for Cancer Research, Oslo University
Hospital, Oslo, Norway. 163 Department of Oncology, Division of Surgery
and Cancer and Transplantation Medicine, Oslo University Hospital‑Radiumhospitalet,
Oslo, Norway. 164 National Advisory Unit on Late Effects after Cancer
Treatment, Department of Oncology, Oslo University Hospital, Oslo, Norway.
165 Department of Oncology, Akershus University Hospital, Lørenskog, Norway.
166 OSBREAC (Breast Cancer Research Consortium (Chair: Kristine K. Sahlberg),
Oslo University Hospital, Oslo, Norway. 167 Westmead Institute for Medical
Research, University of Sydney, Sydney, NSW, Australia. 168 Pathology West
ICPMR, Westmead, NSW, Australia. 169 Kolling Institute of Medical Research,
University of Sydney, Royal North Shore Hospital, Sydney, NSW, Australia.
170 Pathology North, John Hunter Hospital, Newcastle, NSW 2305, Australia.
171 Westmead Institute for Medical Research, University of Sydney, Sydney,
NSW, Australia. 172 Department of Anatomical Pathology, ACT Pathology,
Canberra Hospital, Canberra, ACT, Australia. 173 ANU Medical School, Australian
National University, Canberra, ACT, Australia. 174 Department of Surgical
Oncology, Calvary Mater Newcastle Hospital, Australian New Zealand Breast
Cancer Trials Group, and School of Medicine and Public Health, University
of Newcastle, Newcastle, NSW, Australia. 175 School of Science and Health, The
University of Western Sydney, Sydney, Australia. 176 Hormones and Cancer
Group, Kolling Institute of Medical Research, Royal North Shore Hospital,
University of Sydney, Newcastle, NSW, Australia. 177 SydPath St Vincent’s
Hospital, Sydney, NSW, Australia. 178 Department of Tissue Pathology
and Diagnostic Oncology, Pathology West, Westmead Breast Cancer Institute,
Westmead Hospital, Westmead, NSW, Australia. 179 Centre for Information
Based Medicine, Hunter Medical Research Institute, New Lambton Heights,
NSW 2305, Australia. 180 Priority Research Centre for Cancer, School of Biomedical
Sciences and Pharmacy, Faculty of Health, University of Newcastle,
Newcastle, NSW, Australia. 181 The University of Queensland: UQ Centre
for Clinical Research and School of Medicine, Brisbane, QLD, Australia.
182 Hereditary Cancer Clinic, St Vincent’s Hospital, The Kinghorn Cancer Centre,
Sydney, NSW 2010, Australia. 183 Crown Princess Mary Cancer Centre,
Westmead Hospital, Westmead, Australia. 184 Sydney Medical School
‑ Westmead, University of Sydney, Sydney, NSW, Australia. 185 Department
of Medical Oncology, The Canberra Hospital, Canberra, ACT, Australia. 186 St
John of God Perth Northern Hospitals, Perth, WA, Australia. 187 Peter MacCallum
Cancer Centre, Melbourne, Australia. 188 QIMR Berghofer Medical Research
Institute, Brisbane, Australia. 189 Westmead Institute for Medical Research,
Sydney, Australia. 190 BCNA delegate, Community Representative, Melbourne,
Australia. 191 Westmead Hospital, Sydney, Australia. 192 Walter and Eliza Hall
Institute, Melbourne, Australia. 193 University of Sydney, Sydney, Australia.
194 University of Melbourne, Melbourne, Australia. 195 Cancer Council Victoria,
Melbourne, Australia. 196 Melbourne Health, Melbourne, Australia.
Received: 15 June 2021 Accepted: 2 November 2021
References
1. Cancer Genome Atlas N. Comprehensive molecular portraits of human
breast tumours. Nature. 2012;490(7418):61–70.
2. Curigliano G, Burstein HJ, Gnant M, Dubsky P, Loibl S, Colleoni M, Regan
MM, Piccart‑Gebhart M, Senn HJ et al: De‑escalating and escalating treatments
for early‑stage breast cancer: the St Gallen International Expert
Consensus Conference on the Primary Therapy of Early Breast Cancer
2017. Ann Oncol 2017, 28(8):1700–1712.
3. Goldhirsch A, Winer EP, Coates AS, Gelber RD, Piccart‑Gebhart M, Thurlimann
B, Senn HJ, Panel members. Personalizing the treatment of women with
early breast cancer: highlights of the St Gallen international expert consensus
on the primary therapy of early breast cancer 2013. Ann Oncol.
2013;24(9):2206–23.
4. Barnard ME, Boeke CE, Tamimi RM. Established breast cancer risk
factors and risk of intrinsic tumor subtypes. Biochim Biophys Acta.
2015;1856(1):73–85.
5. Yang XR, Chang‑Claude J, Goode EL, Couch FJ, Nevanlinna H, Milne
RL, Gaudet M, Schmidt MK, Broeks A, Cox A, et al. Associations of
breast cancer risk factors with tumor subtypes: a pooled analysis from
the Breast Cancer Association Consortium studies. J Natl Cancer Inst.
2011;103(3):250–63.
6. Michailidou K, Lindstrom S, Dennis J, Beesley J, Hui S, Kar S, Lemacon A,
Soucy P, Glubb D, Rostamianfar A, et al. Association analysis identifies 65
new breast cancer risk loci. Nature. 2017;551(7678):92–4.
7. Milne RL, Kuchenbaecker KB, Michailidou K, Beesley J, Kar S, Lindstrom
S, Hui S, Lemacon A, Soucy P, Dennis J, et al. Identification of ten variants
associated with risk of estrogen‑receptor‑negative breast cancer. Nat
Genet. 2017;49(12):1767–78.
8. Garcia‑Closas M, Couch FJ, Lindstrom S, Michailidou K, Schmidt MK, Brook
MN, Orr N, Rhie SK, Riboli E, Feigelson HS, et al. Genome‑wide association
studies identify four ER negative‑specific breast cancer risk loci. Nat
Genet. 2013;45(4):392–8.
9. Dunning AM, Michailidou K, Kuchenbaecker KB, Thompson D, French
JD, Beesley J, Healey CS, Kar S, Pooley KA, Lopez‑Knowles E, et al. Breast
cancer risk variants at 6q25 display different phenotype associations and
regulate ESR1, RMND1 and CCDC170. Nat Genet. 2016;48(4):374–86.
10. Milne RL, Goode EL, Garcia‑Closas M, Couch FJ, Severi G, Hein R, Fredericksen
Z, Malats N, Zamora MP, Arias Perez JI, et al. Confirmation of 5p12 as
a susceptibility locus for progesterone‑receptor‑positive, lower grade
breast cancer. Cancer Epidemiol Biomark Prevent. 2011;20(10):2222–31.
11. Figueroa JD, Garcia‑Closas M, Humphreys M, Platte R, Hopper JL, Southey
MC, Apicella C, Hammet F, Schmidt MK, Broeks A, et al. Associations of
common variants at 1p11.2 and 14q24.1 (RAD51L1) with breast cancer
risk and heterogeneity by tumor subtype: findings from the Breast Cancer
Association Consortium. Hum Mol Genet. 2011;20(23):4693–706.
12. Orr N, Dudbridge F, Dryden N, Maguire S, Novo D, Perrakis E, Johnson N,
Ghoussaini M, Hopper JL, Southey MC, et al. Fine‑mapping identifies two
additional breast cancer susceptibility loci at 9q31.2. Hum Mol Genet.
2015;24(10):2966–84.
13. Broeks A, Schmidt MK, Sherman ME, Couch FJ, Hopper JL, Dite GS,
Apicella C, Smith LD, Hammet F, Southey MC, et al. Low penetrance
breast cancer susceptibility loci are associated with specific breast tumor
subtypes: findings from the Breast Cancer Association Consortium. Hum
Mol Genet. 2011;20(16):3289–303.
14. Zhang H, Zhao N, Ahearn TU, Wheeler W, García‑Closas M, Chatterjee N: A
mixed‑model approach for powerful testing of genetic associations with
cancer risk incorporating tumor characteristics. Biostatistics 2020.
15. Zhang H, Ahearn TU, Lecarpentier J, Barnes D, Beesley J, Qi G, Jiang X,
O’Mara TA, Zhao N, Bolla MK, et al. Genome‑wide association study identifies
32 novel breast cancer susceptibility loci from overall and subtype‑specific
analyses. Nat Genet. 2020;52(6):572–81.
16. Fachal L, Aschard H, Beesley J, Barnes DR, Allen J, Kar S, Pooley KA, Dennis
J, Michailidou K, Turman C, et al. Fine‑mapping of 150 breast cancer risk
regions identifies 191 likely target genes. Nat Genet. 2020;52(1):56–73.
17. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete
data via the EM algorithm. J Roy Stat Soc B Met. 1977;39(1):1–38.
18. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical
and powerful approach to multiple testing. J Roy Stat Soc: Ser B (Methodol).
1995;57(1):289–300.
19. Gu Z, Eils R, Schlesner M. Complex heatmaps reveal patterns and
correlations in multidimensional genomic data. Bioinformatics.
2016;32(18):2847–9.
20. DeSantis CE, Ma J, Gaudet MM, Newman LA, Miller KD, Goding Sauer
A, Jemal A, Siegel RL. Breast cancer statistics, 2019. CA Cancer J Clin.
2019;69(6):438–51.
21. Mesa‑Eguiagaray I, Wild SH, Rosenberg PS, Bird SM, Brewster DH, Hall
PS, Cameron DA, Morrison D, Figueroa JD. Distinct temporal trends in
breast cancer incidence from 1997 to 2016 by molecular subtypes: a
population‑based study of Scottish cancer registry data. Br J Cancer.
2020;123(5):852–9.
22. Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger
DG, Struewing JP, Morrison J, Field H, Luben R, et al. Genome‑wide
association study identifies novel breast cancer susceptibility loci. Nature.
2007;447(7148):1087–93.
23. Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, Wacholder
S, Wang Z, Welch R, Hutchinson A, et al. A genome‑wide association
study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal
breast cancer. Nat Genet. 2007;39(7):870–4.
24. Stacey SN, Manolescu A, Sulem P, Rafnar T, Gudmundsson J, Gudjonsson
SA, Masson G, Jakobsdottir M, Thorlacius S, Helgason A, et al. Common
variants on chromosomes 2q35 and 16q12 confer susceptibility to estrogen
receptor‑positive breast cancer. Nat Genet. 2007;39(7):865–9.
25. Li J, Williams BL, Haire LF, Goldberg M, Wilker E, Durocher D, Yaffe MB,
Jackson SP, Smerdon SJ. Structural and functional versatility of the FHA
domain in DNA‑damage signaling by the tumor suppressor kinase Chk2.
Mol Cell. 2002;9(5):1045–54.
26. McKay JD, Hung RJ, Han Y, Zong X, Carreras‑Torres R, Christiani DC,
Caporaso NE, Johansson M, Xiao X, Li Y, et al. Large‑scale association
analysis identifies new lung cancer susceptibility loci and heterogeneity
in genetic susceptibility across histological subtypes. Nat Genet.
2017;49(7):1126–32.
27. Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, Zong X,
Laplana M, Wei Y, Han Y, et al. Rare variants of large effect in BRCA2 and
CHEK2 affect risk of lung cancer. Nat Genet. 2014;46(7):736–41.
28. Obazee O, Archibugi L, Andriulli A, Soucek P, Malecka‑Panas E, Ivanauskas
A, Johnson T, Gazouli M, Pausch T, Lawlor RT, et al. Germline BRCA2
K3326X and CHEK2 I157T mutations increase risk for sporadic pancreatic
ductal adenocarcinoma. Int J Cancer. 2019;145(3):686–93.
29. Rudd MF, Sellick GS, Webb EL, Catovsky D, Houlston RS. Variants in the
ATM‑BRCA2‑CHEK2 axis predispose to chronic lymphocytic leukemia.
Blood. 2006;108(2):638–44.
30. Liu C, Wang QS, Wang YJ. The CHEK2 I157T variant and colorectal cancer
susceptibility: a systematic review and meta‑analysis. Asian Pac J Cancer
Prev. 2012;13(5):2051–5.
31. Lesseur C, Ferreiro‑Iglesias A, McKay JD, Bosse Y, Johansson M, Gaborieau
V, Landi MT, Christiani DC, Caporaso NE, Bojesen SE, et al. Genome‑wide
association meta‑analysis identifies pleiotropic risk loci for aerodigestive
squamous cell cancers. PLoS Genet. 2021;17(3):e1009254.
32. Ovarian Cancer Association Consortium ‑ Results lookup by
region [http://ocac.ccge.medschl.cam.ac.uk/data-projects/results-lookup-by-region/]
33. Fang J, Jia J, Makowski M, Xu M, Wang Z, Zhang T, Hoskins JW, Choi J,
Han Y, Zhang M, et al. Functional characterization of a multi‑cancer risk
locus on chr5p15.33 reveals regulation of TERT by ZNF148. Nat Commun.
2017;8(1):15034.
34. Kim‑Hellmuth S, Aguet F, Oliva M, Munoz‑Aguirre M, Kasela S, Wucher V,
Castel SE, Hamel AR, Vinuela A, Roberts AL et al: Cell type‑specific genetic
regulation of gene expression across human tissues. Science 2020,
369(6509).
35. Li X, Zou W, Liu M, Cao W, Jiang Y, An G, Wang Y, Huang S, Zhao X. Association
of multiple genetic variants with breast cancer susceptibility in the
Han Chinese population. Oncotarget. 2016;7(51):85483–91.
36. Michailidou K, Hall P, Gonzalez‑Neira A, Ghoussaini M, Dennis J, Milne
RL, Schmidt MK, Chang‑Claude J, Bojesen SE, Bolla MK, et al. Large‑scale
genotyping identifies 41 new loci associated with breast cancer risk. Nat
Genet. 2013;45(4):353–61.
37. Darabi H, McCue K, Beesley J, Michailidou K, Nord S, Kar S, Humphreys K,
Thompson D, Ghoussaini M, Bolla MK, et al. Polymorphisms in a putative
enhancer at the 10q21.2 breast cancer risk locus regulate NRBF2 expression.
Am J Hum Genet. 2015;97(1):22–34.
38. Elston CW, Ellis IO. Pathological prognostic factors in breast cancer. I. The
value of histological grade in breast cancer: experience from a large study
with long‑term follow‑up. Histopathology. 1991;19(5):403–10.
39. Bombonati A, Sgroi DC. The molecular pathology of breast cancer progression.
J Pathol. 2011;223(2):307–17.
40. Schymik B, Buerger H, Kramer A, Voss U, van der Groep P, Meinerz W,
van Diest PJ, Korsching E. Is there “progression through grade” in ductal
invasive breast cancer? Breast Cancer Res Treat. 2012;135(3):693–703.
41. Roylance R, Gorman P, Harris W, Liebmann R, Barnes D, Hanby A, Sheer
D. Comparative genomic hybridization of breast tumors stratified by
histological grade reveals new insights into the biological progression of
breast cancer. Can Res. 1999;59(7):1433–6.
42. Rajakariar R, Walker RA. Pathological and biological features of mammographically
detected invasive breast carcinomas. Br J Cancer.
1995;71(1):150–4.
43. Mazoyer S, Dunning AM, Serova O, Dearden J, Puget N, Healey CS,
Gayther SA, Mangion J, Stratton MR, Lynch HT, et al. A polymorphic stop
codon in BRCA2. Nat Genet. 1996;14(3):253–4.
44. Meeks HD, Song H, Michailidou K, Bolla MK, Dennis J, Wang Q, Barrowdale
D, Frost D, McGuffog L, Ellis S, et al. BRCA2 Polymorphic Stop Codon
K3326X and the Risk of Breast, Prostate, and Ovarian Cancers. J Natl
Cancer Inst. 2016;108(2).
45. Darabi H, Beesley J, Droit A, Kar S, Nord S, Moradi Marjaneh M, Soucy P,
Michailidou K, Ghoussaini M, Fues Wahl H, et al. Fine scale mapping of
the 17q22 breast cancer locus using dense SNPs, genotyped within the
Collaborative Oncological Gene‑Environment Study (COGs). Sci Rep.
2016;6:32512.
46. Michailidou K, Beesley J, Lindstrom S, Canisius S, Dennis J, Lush MJ,
Maranian MJ, Bolla MK, Wang Q, Shah M, et al. Genome‑wide association
analysis of more than 120,000 individuals identifies 15 new susceptibility
loci for breast cancer. Nat Genet. 2015;47(4):373–80.
47. Little RJA, Rubin DB. Statistical analysis with missing data. 3rd edn. Wiley series
in probability and statistics. Hoboken, NJ: Wiley; 2019.
48. Mavaddat N, Michailidou K, Dennis J, Lush M, Fachal L, Lee A, Tyrer
JP, Chen TH, Wang Q, Bolla MK, et al. Polygenic Risk Scores for Prediction
of Breast Cancer and Breast Cancer Subtypes. Am J Hum Genet.
2019;104(1):21–34.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
... It is conceivable that the stratified model, which first stratified patients into subgroups, and then applied individual based risk stratification, can be used to predict biologically relevant clinical outcomes. The concept of "subtype specific" biomarkers has been successfully applied to improve the prognosis of many kinds of cancers (Hu et al., 2021;Ahearn et al., 2022;Zhang et al., 2022). Therefore, integrating subtype analysis and individual based models may be a promising method to develop clinically relevant biomarkers. ...
Background: The mechanism of DNA damage repair plays an important role in many solid tumors represented by cervical cancer. Purpose: The purpose of this study was to explore the effect of DNA damage repair-related genes on immune function of patients with cervical cancer, and to establish and evaluate a prognosis model based on DNA damage repair-related genes. Methods: In the study, we analyzed the genes related to DNA damage and repair, and obtained two subtypes (F1 and F2). We selected two groups of samples for different selection, and studied which pathways were enriched expression. For different subtypes, the immune score was explored to explain immune infiltration. We got the key genes through screening, and established the prognosis model through the key genes. These 11 key genes were correlated with the expression of common Clusters of Differentiation (CD) genes in order to explore the effects of these genes on immunity. Results: Through the Least absolute shrinkage and selection operator (LASSO) method, we screened 11 genes from 232 candidate genes as the key genes for the prognosis score. Through the Kaplan-Meier method, four genes (HAP1, MCM5, RNASEH2A, CETN2) with significant prognostic significance were screened into the final model, forming a Nomogram with C-index of 0.716 (0.649-1.0). Conclusion: In cervical cancer, DNA damage repair related genes and immune cell infection characteristics have certain association, and DNA damage repair related genes and immune cell infection characteristics can effectively predict the prognosis.
Objective: This study aimed to assess the expression of NCAPH in human breast cancer and to investigate its effects on breast cancer cells. Methods: Bioinformatic analysis was performed to compare the expression of NCAPH in human breast cancer tissues and normal tissues in the TCGA database. qPCR and immunoblot assays were performed to measure the expression of NCAPH in breast cancer tissues and cell lines, respectively. CCK-8, colony formation, FCM, transwell, and immunoblot assays were performed to assess the effects of NCAPH on the proliferation, cell cycle, motility and EMT of breast cancer cells. Additionally, immunoblot assays were performed to investigate the effects of NCAPH on the PI3K/AKT pathway in breast cancer. Results: NCAPH was highly expressed in human breast cancer cell lines. Depletion of NCAPH suppressed the viability of breast cancer cells. Its downregulation also restrained breast cancer cell migration and invasion, as well as the EMT process. Mechanistically, NCAPH mediated the PI3K/AKT pathway and thereby contributed to breast cancer progression. Conclusion: NCAPH has the potential to serve as a breast cancer target.
Squamous cell carcinomas (SqCC) of the aerodigestive tract have similar etiological risk factors. Although genetic risk variants for individual cancers have been identified, an agnostic, genome-wide search for shared genetic susceptibility has not been performed. To identify novel and pleiotropic SqCC risk variants, we performed a meta-analysis of GWAS data on lung SqCC (LuSqCC), oro/pharyngeal SqCC (OSqCC), laryngeal SqCC (LaSqCC) and esophageal SqCC (ESqCC) cancers, totaling 13,887 cases and 61,961 controls of European ancestry. We identified one novel genome-wide significant (Pmeta < 5 × 10−8) aerodigestive SqCC susceptibility locus in the 2q33.1 region (rs56321285, TMEM273). Additionally, three previously unknown loci reached suggestive significance (Pmeta < 5 × 10−7): 1q32.1 (rs12133735, near MDM4), 5q31.2 (rs13181561, TMEM173) and 19p13.11 (rs61494113, ABHD8). Multiple previously identified loci for aerodigestive SqCC also showed evidence of pleiotropy in at least one other SqCC site; these include 4q23 (ADH1B), 6p21.33 (STK19), 6p21.32 (HLA-DQB1), 9p21.33 (CDKN2B-AS1) and 13q13.1 (BRCA2). Gene-based association and gene set enrichment analyses identified a set of 48 SqCC-related genes linked to DNA damage and epigenetic regulation pathways. Our study highlights the importance of cross-cancer analyses to identify pleiotropic risk loci of histology-related cancers arising at distinct anatomical sites.
We describe temporal trends in breast cancer incidence by molecular subtypes in Scotland because public health prevention programmes, diagnostic and therapeutic services are shaped by differences in tumour biology. Population-based cancer registry data on 72,217 women diagnosed with incident primary breast cancer from 1997 to 2016 were analysed. Age-standardised rates (ASR) and age-specific incidence were estimated by tumour subtype after imputing the 8% of missing oestrogen receptor (ER) status. Joinpoint regression and age–period–cohort models were used to assess whether significant differences were observed in incidence trends by ER status. Overall, ER-positive tumour incidence increased by 0.4%/year (95% confidence interval (CI): −0.1, 1.0). Among routinely screened women aged 50–69 years, we observed an increase in ASR from 1997 to 2011 (1.6%/year, 95% CI: 1.2–2.1). ER-negative tumour incidence decreased among all ages by 2.5%/year (95% CI: −3.9 to −1.1%) over the study period. Compared with the 1941–1959 birth cohort, women born in 1912–1940 had lower incidence rate ratios (IRR) for ER+ tumours and women born in 1960–1986 had lower IRR for ER− tumours. Future incidence and survival reporting should be monitored by molecular subtypes to inform clinical planning and cancer control programmes.
Breast cancer susceptibility variants frequently show heterogeneity in associations by tumor subtype1–3. To identify novel loci, we performed a genome-wide association study including 133,384 breast cancer cases and 113,789 controls, plus 18,908 BRCA1 mutation carriers (9,414 with breast cancer) of European ancestry, using both standard and novel methodologies that account for underlying tumor heterogeneity by estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 status and tumor grade. We identified 32 novel susceptibility loci (P < 5.0 × 10−8), 15 of which showed evidence for associations with at least one tumor feature (false discovery rate < 0.05). Five loci showed associations (P < 0.05) in opposite directions between luminal and non-luminal subtypes. In silico analyses showed that these five loci contained cell-specific enhancers that differed between normal luminal and basal mammary cells. The genetic correlations between five intrinsic-like subtypes ranged from 0.35 to 0.80. The proportion of genome-wide chip heritability explained by all known susceptibility loci was 54.2% for luminal A-like disease and 37.6% for triple-negative disease. The odds ratios of polygenic risk scores, which included 330 variants, for the highest 1% of quantiles compared with middle quantiles were 5.63 and 3.02 for luminal A-like and triple-negative disease, respectively. These findings provide an improved understanding of genetic predisposition to breast cancer subtypes and will inform the development of subtype-specific polygenic risk scores. Genome-wide analysis identifies 32 loci associated with breast cancer susceptibility, accounting for estrogen receptor, progesterone receptor and human epidermal growth factor receptor 2 status and tumor grade.
Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes. Fine-mapping of causal variants and integration of epigenetic and chromatin conformation data identify likely target genes for 150 breast cancer risk regions.
This article is the American Cancer Society's biennial update on female breast cancer statistics in the United States, including data on incidence, mortality, survival, and screening. Over the most recent 5‐year period (2012‐2016), the breast cancer incidence rate increased slightly by 0.3% per year, largely because of rising rates of local stage and hormone receptor‐positive disease. In contrast, the breast cancer death rate continues to decline, dropping 40% from 1989 to 2017 and translating to 375,900 breast cancer deaths averted. Notably, the pace of the decline has slowed from an annual decrease of 1.9% during 1998 through 2011 to 1.3% during 2011 through 2017, largely driven by the trend in white women. Consequently, the black–white disparity in breast cancer mortality has remained stable since 2011 after widening over the past 3 decades. Nevertheless, the death rate remains 40% higher in blacks (28.4 vs 20.3 deaths per 100,000) despite a lower incidence rate (126.7 vs 130.8); this disparity is magnified among black women aged <50 years, who have a death rate double that of whites. In the most recent 5‐year period (2013‐2017), the death rate declined in Hispanics (2.1% per year), blacks (1.5%), whites (1.0%), and Asians/Pacific Islanders (0.8%) but was stable in American Indians/Alaska Natives. However, by state, breast cancer mortality rates are no longer declining in Nebraska overall; in Colorado and Wisconsin in black women; and in Nebraska, Texas, and Virginia in white women. Breast cancer was the leading cause of cancer death in women (surpassing lung cancer) in four Southern and two Midwestern states among blacks and in Utah among whites during 2016‐2017. Declines in breast cancer mortality could be accelerated by expanding access to high‐quality prevention, early detection, and treatment services to all women.
Stratification of women according to their risk of breast cancer based on polygenic risk scores (PRSs) could improve screening and prevention strategies. Our aim was to develop PRSs, optimized for prediction of estrogen receptor (ER)-specific disease, from the largest available genome-wide association dataset and to empirically validate the PRSs in prospective studies. The development dataset comprised 94,075 case subjects and 75,017 control subjects of European ancestry from 69 studies, divided into training and validation sets. Samples were genotyped using genome-wide arrays, and single-nucleotide polymorphisms (SNPs) were selected by stepwise regression or lasso penalized regression. The best performing PRSs were validated in an independent test set comprising 11,428 case subjects and 18,323 control subjects from 10 prospective studies and 190,040 women from UK Biobank (3,215 incident breast cancers). For the best PRSs (313 SNPs), the odds ratio for overall disease per 1 standard deviation in ten prospective studies was 1.61 (95%CI: 1.57-1.65) with area under receiver-operator curve (AUC) = 0.630 (95%CI: 0.628-0.651). The lifetime risk of overall breast cancer in the top centile of the PRSs was 32.6%. Compared with women in the middle quintile, those in the highest 1% of risk had 4.37- and 2.78-fold risks, and those in the lowest 1% of risk had 0.16- and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. Goodness-of-fit tests indicated that this PRS was well calibrated and predicts disease risk accurately in the tails of the distribution. This PRS is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs.
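The score described in the abstract above can be illustrated with a minimal sketch, assuming the usual additive form in which a PRS is a weighted sum of risk-allele dosages and the odds scale log-linearly with the standardized score. The function names, the three-SNP dosages and the weights are hypothetical; only the per-standard-deviation odds ratio of 1.61 comes from the abstract.

```python
def polygenic_risk_score(dosages, log_odds_weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP)."""
    if len(dosages) != len(log_odds_weights):
        raise ValueError("one weight is needed per SNP")
    return sum(d * w for d, w in zip(dosages, log_odds_weights))


def relative_odds(z_score, or_per_sd=1.61):
    """Odds relative to a woman at the mean PRS, for a score z_score
    standard deviations from the mean (assumes a log-linear effect)."""
    return or_per_sd ** z_score


# Hypothetical three-SNP example; the published score uses 313 SNPs.
score = polygenic_risk_score([0, 1, 2], [0.10, 0.20, -0.05])
```

Under these assumptions, `relative_odds(2)` gives the odds for a score two standard deviations above the mean relative to the mean; it is not a substitute for the empirically estimated tail risks quoted in the abstract.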
The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enables the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.
Cancers are routinely classified into subtypes according to various features, including histopathological characteristics and molecular markers. Previous genome-wide association studies have reported heterogeneous associations between loci and cancer subtypes. However, it is not evident what is the optimal modeling strategy for handling correlated tumor features, missing data, and increased degrees-of-freedom in the underlying tests of associations. We propose to test for genetic associations using a mixed-effect two-stage polytomous model score test (MTOP). In the first stage, a standard polytomous model is used to specify all possible subtypes defined by the cross-classification of the tumor characteristics. In the second stage, the subtype-specific case-control odds ratios are specified using a more parsimonious model based on the case-control odds ratio for a baseline subtype, and the case-case parameters associated with tumor markers. Further, to reduce the degrees-of-freedom, we specify case-case parameters for additional exploratory markers using a random-effect model. We use the Expectation-Maximization algorithm to account for missing data on tumor markers. Through simulations across a range of realistic scenarios and data from the Polish Breast Cancer Study (PBCS), we show MTOP outperforms alternative methods for identifying heterogeneous associations between risk loci and tumor subtypes. The proposed methods have been implemented in a user-friendly and high-speed R statistical package called TOP (https://github.com/andrewhaoyu/TOP).
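The second-stage decomposition described in the abstract above can be sketched as follows. This is an illustrative reading, not the TOP package's implementation: it assumes every tumor marker is binary and that a subtype's log odds ratio is the baseline case-control parameter plus one case-case parameter per positive marker. The function names and parameter values are hypothetical.

```python
from itertools import product


def second_stage_design(n_markers):
    """One row per subtype (every cross-classification of binary markers);
    columns are an intercept for the baseline subtype plus one per marker."""
    return [(1,) + levels for levels in product((0, 1), repeat=n_markers)]


def subtype_log_odds_ratios(theta, n_markers):
    """Map second-stage parameters to first-stage subtype log odds ratios.

    theta[0] is the baseline case-control log odds ratio; theta[k] is the
    case-case parameter for marker k.
    """
    if len(theta) != n_markers + 1:
        raise ValueError("theta needs a baseline plus one entry per marker")
    return [sum(z * t for z, t in zip(row, theta))
            for row in second_stage_design(n_markers)]


# Hypothetical parameters for two markers: baseline, marker 1, marker 2.
theta = [0.10, 0.05, -0.02]
log_ors = subtype_log_odds_ratios(theta, n_markers=2)
```

With three binary markers the first stage has eight subtype-specific odds ratios, while this second stage needs only four parameters, which is the parsimony the abstract refers to.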
Rare truncating BRCA2 K3326X (rs11571833) and pathogenic CHEK2 I157T (rs17879961) variants have previously been implicated in familial pancreatic ductal adenocarcinoma (PDAC), but not in sporadic cases. The effect of both mutations in important DNA repair genes on sporadic PDAC risk may shed light on the genetic architecture of this disease. Both mutations were genotyped in germline DNA from 2,935 sporadic PDAC cases and 5,626 control subjects within the PANcreatic Disease ReseArch (PANDoRA) consortium. Risk estimates were evaluated using multivariate unconditional logistic regression with adjustment for possible confounders such as sex, age and country of origin. Statistical analyses were two-sided, with P values < 0.05 considered significant. K3326X and I157T were associated with increased risk of developing sporadic PDAC (odds ratio (ORdom) = 1.78, 95% confidence interval (CI) = 1.26–2.52, P = 1.19 × 10−3 and ORdom = 1.74, 95% CI = 1.15–2.63, P = 8.57 × 10−3, respectively). Neither mutation was significantly associated with risk of developing early-onset PDAC. This retrospective study demonstrates novel risk estimates of K3326X and I157T in sporadic PDAC, which suggest that, upon validation and in combination with other established genetic and non-genetic risk factors, these mutations may be used to improve pancreatic cancer risk assessment in European populations. Identification of carriers of these risk alleles as high-risk groups may also facilitate screening or prevention strategies for such individuals, regardless of family history.