5 JANUARY 2017 | VOL 541 | NATURE | 81
LETTER doi:10.1038/nature20784
Epigenome-wide association study of body mass
index, and the adverse outcomes of adiposity
A list of authors and affiliations appears at the end of the paper.
Approximately 1.5 billion people worldwide are overweight or
affected by obesity, and are at risk of developing type 2 diabetes,
cardiovascular disease and related metabolic and inflammatory
disturbances1,2. Although the mechanisms linking adiposity to
associated clinical conditions are poorly understood, recent studies
suggest that adiposity may influence DNA methylation3–6, a key
regulator of gene expression and molecular phenotype7. Here we
use epigenome-wide association to show that body mass index
(BMI; a key measure of adiposity) is associated with widespread
changes in DNA methylation (187 genetic loci with P < 1 × 10−7,
range P = 9.2 × 10−8 to 6.0 × 10−46; n = 10,261 samples). Genetic
association analyses demonstrate that the alterations in DNA
methylation are predominantly the consequence of adiposity,
rather than the cause. We find that methylation loci are enriched
for functional genomic features in multiple tissues (P < 0.05), and
show that sentinel methylation markers identify gene expression
signatures at 38 loci (P < 9.0 × 10−6, range P = 5.5 × 10−6 to
6.1 × 10−35, n = 1,785 samples). The methylation loci identify genes
involved in lipid and lipoprotein metabolism, substrate transport
and inflammatory pathways. Finally, we show that the disturbances
in DNA methylation predict future development of type 2 diabetes
(relative risk per 1 standard deviation increase in methylation risk
score: 2.3 (2.07–2.56); P = 1.1 × 10−54). Our results provide new
insights into the biologic pathways influenced by adiposity, and may
enable development of new strategies for prediction and prevention
of type 2 diabetes and other adverse clinical consequences of obesity.
Our study design is summarized in Extended Data Fig. 1. We carried
out epigenome-wide association in 5,387 individuals from the EPICOR
(n = 514), KORA (n = 2,193) and LOLIPOP (n = 2,680) population
studies (Supplementary Tables 1, 2, and Supplementary Information).
We studied individuals of European (EPICOR, KORA) and Indian
Asian (LOLIPOP) ancestry; both populations are known to be at high
risk of obesity and related metabolic disturbances2,8. DNA methylation
in genomic DNA from blood was quantified using an Illumina Infinium
450K human methylation array. Blood was chosen for the analysis as
it is a metabolically active tissue, with an important role in the adverse
inflammatory and vascular consequences of adiposity, and is widely
used for clinical diagnostic purposes.
Epigenome-wide association identified 278 CpG sites associated
with BMI at P < 1 × 10−7, distributed between 207 genetic loci
(Supplementary Tables 3, 4). At each locus, we identified the sentinel
marker (CpG site with lowest P value for association with BMI), and
carried out replication testing in separate samples of whole blood from
European and Indian Asian men and women in population-based
studies (n = 4,874; Supplementary Table 1). The association between
DNA methylation and BMI was replicated at 187 out of 207 markers
(associated with BMI at P < 0.05 in replication samples with directional
consistency, and at epigenome-wide significance in combined analysis
of discovery and replication data (Fig. 1 and Supplementary Table 3)).
Regional plots for the 187 identified loci are shown in Supplementary
Figs 1 and 2. Effect sizes range from 6.3 ± 0.9 to 40.2 ± 3.1 kg m−2
change in BMI per unit increase in DNA methylation in blood (scale
for methylation 0–1, in which 1 represents 100% methylation), with
little evidence for heterogeneity between Europeans and Indian Asians
(Supplementary Table 3). At seven loci, the associations between DNA
methylation and BMI are stronger amongst Indian Asians or Europeans
(heterogeneity P < 1.0 × 10−7), raising the possibility that some effects
may be population specific.
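The sentinel-marker and replication rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the record layout, locus labels, function names and all numeric values are hypothetical, and the combined-analysis P value is assumed to be supplied rather than computed here.

```python
from dataclasses import dataclass

EPIGENOME_WIDE_P = 1e-7   # discovery and combined-analysis threshold from the text
REPLICATION_P = 0.05      # replication threshold from the text


@dataclass
class CpgResult:
    cpg: str                 # probe identifier (hypothetical values below)
    locus: str               # locus label (hypothetical grouping)
    beta_discovery: float    # effect estimate in the discovery samples
    p_discovery: float
    beta_replication: float  # effect estimate in the replication samples
    p_replication: float
    p_combined: float        # combined-analysis P value, assumed precomputed


def sentinel_markers(results):
    """Return, for each locus, the CpG with the lowest discovery P value."""
    best = {}
    for r in results:
        if r.p_discovery >= EPIGENOME_WIDE_P:
            continue  # only epigenome-wide significant CpGs define loci
        if r.locus not in best or r.p_discovery < best[r.locus].p_discovery:
            best[r.locus] = r
    return best


def is_replicated(r):
    """Apply the three replication criteria stated in the text."""
    same_direction = r.beta_discovery * r.beta_replication > 0
    return (r.p_replication < REPLICATION_P
            and same_direction
            and r.p_combined < EPIGENOME_WIDE_P)


# Made-up records, for illustration only.
results = [
    CpgResult("cgA", "locus1", 12.0, 3e-12, 10.5, 1e-4, 2e-15),
    CpgResult("cgB", "locus1", 9.0, 4e-9, 8.0, 2e-3, 1e-10),
    CpgResult("cgC", "locus2", -15.0, 5e-8, 4.0, 0.03, 6e-6),
]
sentinels = sentinel_markers(results)
replicated = sorted(name for name, r in sentinels.items() if is_replicated(r))
print(replicated)
```

In this toy input the second locus fails replication because the replication effect has the opposite sign and the combined P value does not reach epigenome-wide significance.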
Sensitivity analyses show that our findings are robust to the choice of
analytic strategy. The associations of DNA methylation in blood with
BMI are not explained by population stratification caused by DNA
sequence variation, or by genetic confounding of single-nucleotide
polymorphisms (SNPs) in the probe sequence (Supplementary Table 5
and Supplementary Figs 3, 4). In addition, to address the possibility of
confounding by technical factors, we further replicated the associations
of DNA methylation in blood with BMI at 4 loci, in 990 Europeans and
1,720 Indian Asians (LOLIPOP study), using pyrosequencing as an
alternative approach to quantification of methylation (P = 1.2 × 10−7
to 2.1 × 10−12 for association of methylation with BMI; Supplementary
Table 6).
The 187 identified methylation markers are strongly enriched for
CpG sites with intermediate levels of methylation, consistent with
the presence of mosaicism, that is, epigenetic heterogeneity, at these
loci (P = 1.4 × 10−22, Fisher’s test; Extended Data Fig. 2). To better
understand the underlying cellular events, and exclude changes in
cell subset composition as the basis for our findings, we carried out
replication testing of the sentinel loci in isolated white-blood-cell
subsets (monocytes, neutrophils, CD4+ T cells and CD8+ T cells,
n = 60; Supplementary Table 7). Epigenetic heterogeneity is present at
the majority of loci, in each of the cell subsets studied (Extended Data
Fig. 3 and Supplementary Table 8). The sentinel markers are enriched
for association with adiposity in each of the isolated cell subsets
(Extended Data Fig. 4 and Supplementary Table 8), and the relationships
between methylation and obesity are directionally consistent with
the discovery epigenome-wide association study at 130 loci (CD4+ T cells,
P = 1.2 × 10−9, sign test) and 166 loci (neutrophils, P = 5.6 × 10−35,
sign test) (Supplementary Table 9). Furthermore, effect sizes are directionally
consistent and of similar magnitude between the isolated cell
subsets (Extended Data Fig. 5). The association of DNA methylation
with BMI therefore reflects epigenetic heterogeneity at the identified
loci, is independent of changes in cell subset distribution, and comprises
an effect of adiposity on methylation that is shared across the
cell subsets studied.
To assess the relevance of our observations in blood to other metabolically
relevant tissues, we first compared methylation levels at
the 187 loci in blood, subcutaneous and omental fat, liver, muscle,
spleen and pancreas9. Mean methylation levels at the 187 loci
correlate moderately to strongly between the tissues (R = 0.37–0.93,
P = 8.9 × 10−8 to 1.9 × 10−82 for the 21 tissue pairs; Extended Data
Fig. 6 and Supplementary Fig. 5), supporting the view that methylation
levels in blood are related to methylation patterns in other tissues at the
CpG sites examined.
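The between-tissue comparison can be illustrated with a short sketch. Seven tissues are named in the text, which gives the 21 unordered tissue pairs reported. The use of Pearson's correlation is an assumption (the text reports R without naming the estimator), and the methylation values below are randomly generated placeholders, not study data.

```python
import random
from itertools import combinations
from math import sqrt

# Tissues named in the text.
TISSUES = ["blood", "subcutaneous fat", "omental fat", "liver",
           "muscle", "spleen", "pancreas"]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(mean_methylation):
    """Correlate per-locus mean methylation for every unordered tissue pair."""
    return {(a, b): pearson_r(mean_methylation[a], mean_methylation[b])
            for a, b in combinations(TISSUES, 2)}


# Placeholder data: a shared per-locus profile plus tissue-specific noise.
rng = random.Random(0)
locus_profile = [rng.random() for _ in range(10)]
mean_methylation = {
    tissue: [min(1.0, max(0.0, v + rng.gauss(0, 0.1))) for v in locus_profile]
    for tissue in TISSUES
}
correlations = pairwise_correlations(mean_methylation)
print(len(correlations))  # 7 tissues give 21 unordered pairs
```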
Inflammatory and hormonal disturbances in adipocytes of obese
people contribute to the development of insulin resistance and other
metabolic consequences of adiposity10. To better understand how our
findings in the blood might reflect processes in the adipose tissue,
© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
we therefore quantified the relationship between DNA methylation
and BMI in adipose tissue. We found that 120 of the CpG sites show
directional consistency for association with BMI in both adipose
tissue and blood (P = 1.3 × 10−4, binomial test), whereas 91 sites are
associated with BMI in adipose tissue (P < 2.7 × 10−4, that is, P < 0.05
after Bonferroni correction for 187 tests; Supplementary Table 10). The
associations of DNA methylation with BMI in adipose tissue are also
unlikely to be the result of differences in the composition of canonical
cell types. First we used principal component analysis to assess the
cryptic structure arising from variation in cell subset composition in
the methylation data. Including principal components as covariates
in regression models did not materially influence the association of
DNA methylation with BMI in adipose tissue (Supplementary Fig. 6).
In separate studies, we quantified DNA methylation in isolated
adipocytes from subcutaneous adipose tissue collected from morbidly
obese (BMI > 40 kg m−2, n = 24) and normal weight (n = 24)
individuals. Despite the small sample size, 6 out of 187 sentinel markers
were associated with obesity at P < 2.7 × 10−4 (P < 0.05 after Bonferroni
correction; Supplementary Table 11), and 108 markers show relationships
with obesity that are directionally consistent with those observed
in the discovery epigenome-wide association study (P = 0.04). We separately
tested the association of our sentinel methylation markers with
BMI in samples of the liver (n = 55), as this is also a metabolically relevant
tissue. We found that 114 CpG sites showed consistent direction
of association with BMI compared to our findings in blood (P = 0.001,
sign test; Supplementary Table 10). Our findings indicate that many of
the relationships between methylation and BMI in blood are shared by
adipose and liver cells, but also identify effects that are tissue specific.
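Two calculations recur in this paragraph: the Bonferroni-corrected threshold for 187 tests, and a sign (binomial) test on the number of markers with a consistent direction of effect. The sketch below shows both using only the Python standard library. It assumes a two-sided exact binomial test against a 50% null and assumes that all 187 sentinel markers were tested in each tissue; neither detail is stated in this section, so the output should be read as approximate rather than as a reproduction of the reported P values.

```python
from math import comb


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def sign_test_p(n_consistent, n_total):
    """Two-sided exact binomial P value under a 50% null (assumed test form)."""
    k = max(n_consistent, n_total - n_consistent)
    # Probability of k or more consistent signs out of n_total by chance.
    upper_tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * upper_tail)


threshold = bonferroni_threshold(0.05, 187)
print(f"{threshold:.1e}")

# Counts taken from the text; 187 tested markers is an assumption.
for label, consistent in [("adipose tissue", 120), ("isolated adipocytes", 108)]:
    print(label, f"{sign_test_p(consistent, 187):.1e}")
```

Under these assumptions the threshold rounds to the 2.7 × 10−4 quoted in the text, and the two sign-test values fall in the same range as the reported P = 1.3 × 10−4 and P = 0.04.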
Next, we used genetic association and the concept of Mendelian randomization
to investigate the potential causal relationships between
DNA methylation in blood and BMI11. We first identified SNPs influencing
DNA methylation in blood in cis (1 Mb, n = 4,034 people). We
then tested whether SNPs that influence methylation in blood also
influence BMI, and whether the predicted effects of SNPs on BMI
through methylation are consistent with the observed association.
We identified a single CpG (cg26663590: NFATC2IP) that showed
evidence of a genetic association for a causal role of methylation on
BMI (P = 9.6 × 10−7 for association of SNP rs11150675 near NFATC2IP
with BMI; Fig. 2a and Supplementary Table 12). In keeping with a
causal role for methylation at NFATC2IP underlying adiposity, baseline
levels of methylation at cg26663590 predict weight gain in longitudinal
population studies (P = 0.03, Supplementary Table 13). The NFATC2IP
locus contains the gene that encodes SH2B1, which is known to be
involved in energy and glucose homeostasis and has previously been
linked with obesity, including in genome-wide association studies12,13.
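The comparison of predicted and observed SNP effects can be made concrete with a small sketch. It assumes the usual Mendelian randomization product rule, in which the effect of a SNP on BMI predicted through methylation is the SNP-to-methylation effect multiplied by the methylation-to-BMI effect. The decision rule, function names and all numbers are hypothetical simplifications, not the authors' procedure.

```python
def predicted_effect_via_methylation(beta_snp_cpg, beta_cpg_bmi):
    """Per-allele change in BMI expected if the SNP acted only through methylation."""
    return beta_snp_cpg * beta_cpg_bmi


def confidence_interval_95(beta, se):
    """Normal-approximation 95% confidence interval for an effect estimate."""
    half_width = 1.96 * se
    return beta - half_width, beta + half_width


def supports_causal_methylation(beta_snp_bmi, se_snp_bmi, predicted):
    """True if the observed SNP-BMI effect excludes zero and covers the prediction.

    This is an illustrative rule only, not a formal test.
    """
    low, high = confidence_interval_95(beta_snp_bmi, se_snp_bmi)
    excludes_zero = low > 0 or high < 0
    return excludes_zero and low <= predicted <= high


# Hypothetical numbers for one CpG and its cis SNP.
beta_snp_cpg = 0.01   # change in methylation (0-1 scale) per effect allele
beta_cpg_bmi = 20.0   # kg m^-2 change in BMI per unit methylation
predicted = predicted_effect_via_methylation(beta_snp_cpg, beta_cpg_bmi)
print(round(predicted, 3))

# Observed effect 0.18 (s.e. 0.04): interval excludes zero and covers 0.2.
print(supports_causal_methylation(0.18, 0.04, predicted))
# Observed effect 0.02 (s.e. 0.04): interval includes zero.
print(supports_causal_methylation(0.02, 0.04, predicted))
```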
To investigate whether DNA methylation in blood is the consequence
of adiposity, we used a weighted genetic risk score that combines effects
across SNPs known to influence BMI (Fig. 2b and Supplementary Table 14).
We observed a strong correlation between predicted (through
BMI) and observed effects of BMI genetic risk score on methylation
(R2 = 0.65; P = 4.7 × 10−44) at the CpG sites evaluated. In particular,
genetic risk score was associated with DNA methylation at the ABCG1,
KLHL18 and FTH1P20 loci with P < 2.7 × 10−4 (corresponding to
P < 0.05 after Bonferroni correction for 187 tests). An effect of BMI
on ABCG1 methylation is consistent with observations that weight
loss influences both ABCG1 expression in adipose tissue and ABCG1
Figure 1 | Circos plot of the epigenome-wide association of DNA
methylation in blood with BMI. Results are presented as CpG-specific
association test results (−log10(P)) ordered by genomic position.
Green and blue symbols, CpG sites at loci reaching epigenome-wide
significance (P < 1 × 10−7); grey symbols, CpG sites at loci not reaching
epigenome-wide significance. Chromosome numbers are shown on
the inner ring. Tick marks on the outer ring identify the genomic loci
reaching epigenome-wide significance. The genes nearest to the sentinel
methylation markers at each of the 187 loci are listed around the circos
plot.
activity14,15, and by the close relationship between change in BMI
and change in methylation during longitudinal follow-up analyses
of participants in our population studies (Fig. 3 and Supplementary
Table 13). Although further studies are needed to consider mechanisms,
our findings suggest that adiposity determines the alterations in
methylation at the majority of the identified CpG sites.
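The reverse-direction analysis rests on a weighted genetic risk score and on a comparison of predicted and observed effects across CpG sites. A minimal sketch follows. It assumes the score is a sum of effect-allele dosages weighted by published per-allele BMI effects, and that the effect predicted through BMI is the product of the score-to-BMI and BMI-to-methylation effects. All values are invented for illustration, and the Methods should be consulted for the actual definitions.

```python
from math import sqrt


def weighted_genetic_risk_score(dosages, weights):
    """Sum of effect-allele dosages (0, 1 or 2) weighted by per-allele BMI effects."""
    if len(dosages) != len(weights):
        raise ValueError("one weight is needed per SNP")
    return sum(d * w for d, w in zip(dosages, weights))


def predicted_effect_through_bmi(beta_grs_bmi, beta_bmi_cpg):
    """Change in methylation per unit score if the score acted only through BMI."""
    return beta_grs_bmi * beta_bmi_cpg


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy / sqrt(sxx * syy)) ** 2


# Hypothetical individual: dosages at three SNPs and their BMI weights.
score = weighted_genetic_risk_score([0, 1, 2], [0.10, 0.05, 0.20])
print(round(score, 2))

# Hypothetical per-CpG effects of BMI on methylation and observed score effects.
beta_grs_bmi = 1.0
beta_bmi_cpg = [0.002, -0.001, 0.0005, 0.003]
predicted = [predicted_effect_through_bmi(beta_grs_bmi, b) for b in beta_bmi_cpg]
observed = [0.0018, -0.0012, 0.0001, 0.0031]
agreement = r_squared(predicted, observed)
print(round(agreement, 2))
```

A high value of the squared correlation across CpG sites is what the text interprets as evidence that methylation is largely consequential to BMI.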
We used genetic association separately to test the causal relationships
between BMI and DNA methylation in adipose tissue. The results
further confirm that in adipose tissue, as in blood, the differences
in methylation observed are primarily the consequence of adiposity
(R = 0.73, P = 1.6 × 10−32; Extended Data Fig. 7).
We carried out functional genomic analyses to investigate the
potential mechanisms linking the 187 sentinel CpG sites with
adiposity. The CpG sites are strongly enriched in active chromatin
sites, including at DNase hypersensitivity sites and the activating
histone marks H3K4me1 and H3K27ac in a wide range of cell lines
(P < 0.05; Supplementary Fig. 7), suggesting that the adiposity-related
methylation changes we identified occur at constitutive cis-regulatory
regions that operate across tissues. In keeping with a regulatory role,
DNA methylation at the 187 CpG sites is enriched for association with
expression of cis genes (500 kb) in blood (Supplementary Tables 15,
16 and Extended Data Fig. 8). We find 44 transcripts of 38 annotated
genes that are associated with DNA methylation at P < 9.0 × 10−6
(that is, P < 0.05 after Bonferroni correction; Supplementary Table 16);
an approximate threefold enrichment compared to expectations under
the null hypothesis (P = 3.0 × 10−4; Extended Data Fig. 8). Sensitivity
analyses, by limiting assessment of the relationship between methylation
and gene expression to the nearest gene or Illumina-annotated
gene, identified five additional loci potentially associated with gene
expression (Supplementary Table 17). The strongest cis signals observed
were for cg09315878 with TNFRSF4 transcription (P = 7.2 × 10−86),
cg14476101 with PHGDH transcription (P = 1.0 × 10−64) and
cg09152259 with MAP3K2 transcription (P = 1.6 × 10−67). On average,
a 5% absolute change in methylation was associated with a 7% change
in gene expression across the 44 transcripts identified (range 1.8%
for AKAP to 19% for SPNS3; Supplementary Table 16). Among
the 38 methylation–gene expression associations observed in blood,
3 were also found in adipose tissue (HOXA5, BBS2 and SELM (also
known as SELENOM)) and 3 in the liver (ANXA1, LGALS3BP and
PHGDH) at P < 1.3 × 10−3 (that is, P < 0.05 after Bonferroni correction
for 38 tests), all with consistent direction of effect (Supplementary
Table 18), suggesting that the relationships between methylation and
[Figure 2 panels: a, methylation cause of BMI; b, methylation consequence of BMI. Both panels plot predicted effect (x axis) against observed effect (y axis); FTH1P20, ABCG1 and KLHL18 are labelled in b.]
Figure 2 | Genetic association studies to investigate the potential
relationships between BMI and DNA methylation in blood. a, Results
for the causality analysis investigating whether DNA methylation in blood
at the sentinel CpG sites influences BMI. Units are change in BMI per copy
of effect allele. For each sentinel CpG site, we identified the cis SNP (1 Mb)
most closely associated with DNA methylation levels. For each SNP, we
then determined the effect of SNP on BMI predicted via methylation
(x axis) and the directly observed effect of the SNP on BMI (y axis). Grey,
CpGs not significantly associated with a SNP; blue, CpGs significantly
associated with a SNP. For a single CpG (NFATC2IP) the associated SNP
is also associated with BMI and 95% CI error bars are shown. At the other
loci, there was little relationship between the effects of the SNPs on BMI
predicted through methylation and directly observed BMI (R2 = 0.00,
P = 0.86). b, Results for the causality analysis investigating
whether DNA methylation in blood at the sentinel CpG sites is the
consequence of BMI. Units are change in methylation per unit change in
weighted genetic risk score. We identified the SNPs reported to influence
BMI in a genome-wide association meta-analysis12, and calculated a
weighted genetic risk score (see Methods). For each sentinel CpG site, we
then determined the effect of genetic risk score on methylation predicted
through BMI (x axis) and the directly observed effect of genetic risk
score on CpG (y axis). Three CpGs (ABCG1, KLHL18 and FTH1P20)
are associated with the genetic risk score at P < 2.7 × 10−4 (P < 0.05
after Bonferroni correction for 187 tests; 95% CI error bars shown). The
overall correlation between observed and predicted effects (R2 = 0.65;
P = 4.7 × 10−44) suggests that methylation in blood at the majority of CpG
sites is consequential to BMI.
[Figure 3 plot: x axis, cross-sectional study; y axis, longitudinal study. Annotation: 178 out of 187 markers with consistent direction of effect (P = 6.8 × 10−42).]
Figure 3 | Relationship between DNA methylation in blood and
BMI amongst 1,435 participants of the KORA S4/F4 population
cohort. Cross-sectional results (x axis) are for the relationship between
methylation in blood and BMI at each of the 187 sentinel CpG sites
in the baseline samples; longitudinal results are for the relationship
between change in methylation (in blood) and change in BMI after 7-year
follow-up. Units for both axes are kg m−2 change in BMI per unit increase
in methylation (scale 0–1, in which 1 represents 100% methylation).
gene expression are in part shared between blood, adipose and liver
tissue.
We prioritized genes as potential candidate genes involved in the
association between BMI and DNA methylation at the 187 loci on the
basis of two criteria: (1) proximity, the gene nearest to the sentinel
methylation marker and (2) functional genomics, genes within
500 kb of the sentinel methylation marker showing association of
gene expression with methylation (Supplementary Table 19). These
criteria identified 210 unique genes, many with established roles in
adipose tissue biology and insulin resistance (for example, ABCG1,
LPIN1, HOXA5, LMNA, CPT1A, SOCS3, SREBF1 and PHGDH;
Supplementary Tables 19, 20). Gene-set enrichment analyses show that
the 210 candidate genes are enriched for genes involved in lipid and
lipoprotein metabolism, amino-acid and small-molecule transport, and
inflammatory pathways involving NF-κB, MAPK, TAK1 (also known
as MAP3K7), IRAK2 and TRAF6 (Supplementary Table 21).
To investigate the potential clinical significance of the DNA
methylation changes, we first tested the cross-sectional relationship
of DNA methylation in blood with fasting glucose, insulin, HDL
(high-density lipoprotein) cholesterol, triglycerides, haemoglobin
A1c (HbA1c) and other clinical traits. We found that 879 of the
methylation–clinical-trait pairs tested were significant at P < 2.1 × 10−5 (that is,
P < 0.05 after Bonferroni correction for the 2,431 tests performed;
Supplementary Fig. 8 and Supplementary Table 22), consistent with
recent studies reporting close relationships of DNA methylation with
blood lipids and glucose traits16,17. We used genetic association to
investigate the potential causal relationships between DNA methylation
and the identified clinical traits. SNPs influencing methylation markers
in blood showed little evidence for association with the respective
clinical traits (Extended Data Fig. 9). By contrast, the predicted effect
of genetic risk score on DNA methylation through clinical trait is
correlated with the directly observed effect of genetic risk score on
methylation for HbA1c, HDL cholesterol, triglycerides and insulin
(P = 1 × 10−3 to 2 × 10−14; Extended Data Fig. 9). Our findings suggest
that the methylation changes in blood may in part be a consequence
of the changes in lipid and glucose metabolism associated with BMI.
Finally, we tested whether DNA methylation levels in blood at the
187 sentinel CpG sites predict new onset, incident type 2 diabetes, a
major clinical consequence associated with obesity, in participants
of the LOLIPOP study (n = 2,664). In single-marker tests, 62 of
the 187 methylation markers were associated with incident type 2
diabetes at P < 2.7 × 10−4 (that is, P < 0.05 after Bonferroni correction;
Supplementary Table 23). The strongest association was observed for
the ABCG1 locus, a gene known to be involved in insulin secretion
and pancreatic β-cell function14,15. To integrate information across
CpG sites, we calculated a weighted methylation risk score as the sum
of methylation values at each of the markers associated with type 2
diabetes, weighted by marker-specific effect size. Methylation risk
score is strongly predictive of incident type 2 diabetes (relative risk
2.29 (95% confidence interval (CI), 2.06–2.55) per 1 s.d. change in
methylation risk score; P = 4.2 × 10−52). The association of methylation
risk score with incident type 2 diabetes was also found in Europeans
from the KORA study (relative risk, 2.51 (95% CI, 1.49–4.23) per 1 s.d.
change in methylation risk score; P = 5.7 × 10−4), with no evidence for
heterogeneity of effect (P = 0.74). Methylation risk score predicts type
2 diabetes beyond traditional risk factors including BMI and waist–hip
ratio (Supplementary Table 24), and in particular identifies obese and
overweight individuals with a high risk of developing type 2 diabetes
in the future (relative risk for type 2 diabetes in obese subjects: 7.3
(4.1–12.9), P = 8.2 × 10−12 in the top compared to the lowest quartile;
Fig. 4). The risk of type 2 diabetes associated with DNA methylation
markers, as estimated in our study, is numerically
similar to, or greater than, the estimated risk conferred by traditional
risk factors, including overweight, obesity, central obesity, impaired
fasting glucose and hyperinsulinaemia (Extended Data Fig. 10).
Furthermore, DNA methylation remains strongly and independently
associated with risk of future type 2 diabetes even after adjustment for
adiposity and glycaemic measures. By contrast, emergent risk factors
such as C-reactive protein and amino acid concentrations showed little
evidence for an independent association with type 2 diabetes. Our
findings therefore raise the possibility that DNA-methylation markers
may help to identify individuals with metabolically unfavourable
adiposity who are at increased risk of developing type 2 diabetes.
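The methylation risk score described in this paragraph is a weighted sum, which can be sketched directly. The example below computes the score, standardises it so that effects can be expressed per 1 s.d., and assigns quartiles of the kind used in Fig. 4. The probe names, weights and methylation values are hypothetical, the quartile cut-points use the default method of Python's `statistics.quantiles` (an arbitrary choice), and the regression that would yield a relative risk is not shown.

```python
from statistics import mean, quantiles, stdev


def methylation_risk_score(methylation, weights):
    """Sum of methylation values (0-1 scale) weighted by marker-specific effect size."""
    return sum(weights[cpg] * methylation[cpg] for cpg in weights)


def standardise(scores):
    """Rescale scores to mean 0 and s.d. 1, so effects read as per 1 s.d. change."""
    centre, spread = mean(scores), stdev(scores)
    return [(s - centre) / spread for s in scores]


def quartile_labels(scores):
    """Assign each score to quartile 1 (lowest) to 4 (highest)."""
    q1, q2, q3 = quantiles(scores, n=4)
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


# Hypothetical weights (effect sizes) and one participant's methylation values.
weights = {"cgX": 2.0, "cgY": -1.0}
participant = {"cgX": 0.5, "cgY": 0.25}
print(methylation_risk_score(participant, weights))

# Hypothetical scores for eight participants.
cohort_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
z_scores = standardise(cohort_scores)
labels = quartile_labels(cohort_scores)
print(labels)
```

Standardising before modelling is what allows the reported relative risks to be read as the change in risk per 1 s.d. of the score.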
Our large-scale epigenome-wide association study identifies and
replicates changes in DNA methylation associated with BMI in blood
and adipose tissue. The associations of methylation with BMI are independent
of variation in cell subset composition and replicate in both
isolated white blood cells and isolated adipocytes. Genetic association
in both blood and adipose tissue supports the view that the changes
in DNA methylation are a consequence and not the cause of adiposity,
at the majority of the identified CpG sites. The presence of epigenetic
heterogeneity at the identified loci, even within isolated canonical cell
subsets, together with a graded relationship between methylation and
BMI, suggests epigenetic reprogramming within committed cell subsets
in response to adiposity, as recently described in other tissues18. In
keeping with this, the methylation loci are enriched for sites of open
chromatin in multiple tissues, consistent with the presence of constitutive
cis enhancers.
The candidate genes at these loci include genes with annotated
roles in lipid metabolism, amino acid and small molecule transport,
inflammation, as well as metabolic, cardiovascular, respiratory and
neoplastic disease. For example, TNFRSF4 and MAP3K2 encode
proteins involved in activation of NF-κB19, whereas IL5RA is involved
in development and activation of eosinophil and other immune cells,
and is causally linked to asthma, eczema and cardiovascular disease20.
ABCG1 is involved in cholesterol and phospholipid transport, and
regulates insulin secretion17,21. Our observations thus provide insight
into the regulatory pathways that may link adiposity to metabolic and
cardiovascular disease, asthma and a wide range of cancers, although
our study is limited in the tissues examined, and further studies
are needed to include additional biologically relevant tissues. Our
prospective population studies show that DNA methylation identifies
people at high risk of incident type 2 diabetes, independent of
conventional risk factors. Further studies are needed to examine
whether DNA-methylation markers may be useful in distinguishing
metabolically unhealthy obesity. This may prove useful in risk stratification
and personalized medicine, to help to tackle the current global
epidemic of obesity and its associated cardiovascular and metabolic
disturbances.
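[Figure 4 plot: forest plot of odds ratios (x axis, 0–18) by quartile (Q1–Q4) of methylation risk score. Values are controls/cases, P and P trend.
Normal: Q1, 144/29; Q2, 141/32, P = 4.97 × 10−1; Q3, 130/43, P = 2.61 × 10−2; Q4, 106/69, P = 1.89 × 10−7; P trend = 3.85 × 10−7.
Overweight: Q1, 129/27, P = 9.50 × 10−1; Q2, 185/78, P = 7.70 × 10−4; Q3, 169/115, P = 9.00 × 10−8; Q4, 301/321, P = 4.00 × 10−16; P trend = 5.66 × 10−19.
Obese: Q1, 27/17, P = 2.50 × 10−3; Q2, 50/28, P = 5.20 × 10−4; Q3, 59/61, P = 5.10 × 10−10; Q4, 149/255, P = 7.90 × 10−22; P trend = 4.19 × 10−7.
P interaction = 0.56.]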
Figure 4 | Relative risk of incident type 2 diabetes by quartile
of methylation risk score amongst Indian Asians. Subjects were
normoglycaemic (HbA1c < 6% and fasting glucose < 6 mmol l−1)
and normal weight (BMI 18.5–24.9 kg m−2), overweight (BMI 25.0–
29.9 kg m−2) or obese (BMI ≥ 30.0 kg m−2). The interaction P value is for
the interaction between adiposity and DNA methylation in risk of type 2
diabetes.
Online Content Methods, along with any additional Extended Data display items and
Source Data, are available in the online version of the paper; references unique to
these sections appear only in the online paper.
Received 15 March 2015; accepted 10 November 2016.
Published online 21 December 2016.
1. Wang, Y. C., McPherson, K., Marsh, T., Gortmaker, S. L. & Brown, M. Health and
economic burden of the projected obesity trends in the USA and the UK.
Lancet 378, 815–825 (2011).
2. Ng, M. et al. Global, regional, and national prevalence of overweight and obesity
in children and adults during 1980–2013: a systematic analysis for the Global
Burden of Disease Study 2013. Lancet 384, 766–781 (2014).
3. Dick, K. J. et al. DNA methylation and body-mass index: a genome-wide
analysis. Lancet 383, 1990–1998 (2014).
4. Feinberg, A. P. et al. Personalized epigenomic signatures that are stable
over time and covary with body mass index. Sci. Transl. Med. 2, 49ra67
(2010).
5. Xu, X. et al. A genome-wide methylation study on obesity: differential variability
and differential methylation. Epigenetics 8, 522–533 (2013).
6. Demerath, E. W. et al. Epigenome-wide association study (EWAS) of BMI, BMI
change and waist circumference in African American adults identifies multiple
replicated loci. Hum. Mol. Genet. 24, 4464–4479 (2015).
7. Portela, A. & Esteller, M. Epigenetic modifications and human disease.
Nat. Biotechnol. 28, 1057–1068 (2010).
8. Danaei, G. et al. National, regional, and global trends in fasting plasma glucose
and diabetes prevalence since 1980: systematic analysis of health
examination surveys and epidemiological studies with 370 country–years and
2·7 million participants. Lancet 378, 31–40 (2011).
9. Slieker, R. C. et al. Identification and systematic annotation of tissue-specific
differentially methylated regions using the Illumina 450k array. Epigenetics
Chromatin 6, 26 (2013).
10. Rosen, E. D. & Spiegelman, B. M. What we talk about when we talk about fat.
Cell 156, 20–44 (2014).
11. Relton, C. L. & Davey Smith, G. Two-step epigenetic Mendelian randomization:
a strategy for establishing the causal role of epigenetic processes in pathways
to disease. Int. J. Epidemiol. 41, 161–176 (2012).
12. Speliotes, E. K. et al. Association analyses of 249,796 individuals reveal 18 new
loci associated with body mass index. Nat. Genet. 42, 937–948 (2010).
13. Bochukova, E. G. et al. Large, rare chromosomal deletions associated with
severe early-onset obesity. Nature 463, 666–670 (2010).
14. Johansson, L. E. et al. Differential gene expression in adipose tissue from obese
human subjects during weight loss and weight maintenance. Am. J. Clin. Nutr.
96, 196–207 (2012).
15. Aron-Wisnewsky, J. et al. Effect of bariatric surgery-induced weight loss on
SR-BI-, ABCG1-, and ABCA1-mediated cellular cholesterol efflux in obese
women. J. Clin. Endocrinol. Metab. 96, 1151–1159 (2011).
16. Pfeiffer, L. et al. DNA methylation of lipid-related genes affects blood lipid
levels. Circ. Cardiovasc. Genet. 8, 334–342 (2015).
17. Hidalgo, B. et al. Epigenome-wide association study of fasting measures of
glucose, insulin, and HOMA-IR in the Genetics of Lipid Lowering Drugs and
Diet Network study. Diabetes 63, 801–807 (2014).
18. Donkin, I. et al. Obesity and bariatric surgery drive epigenetic variation of
spermatozoa in humans. Cell Metab. 23, 369–378 (2016).
19. Karin, M. & Ben-Neriah, Y. Phosphorylation meets ubiquitination: the control of
NF-κB activity. Annu. Rev. Immunol. 18, 621–663 (2000).
20. Brightling, C. E. et al. Benralizumab for chronic obstructive pulmonary disease
and sputum eosinophilia: a randomised, double-blind, placebo-controlled,
phase 2a study. Lancet Respir. Med. 2, 891–901 (2014).
21. Chambers, J. C. et al. Epigenome-wide association of DNA methylation markers
in peripheral blood from Indian Asians and Europeans with incident type 2
diabetes: a nested case–control study. Lancet Diabetes Endocrinol. 3, 526–534
(2015).
Simone Wahl1,2,3*, Alexander Drong4*, Benjamin Lehne5*, Marie Loh5,6,7*,
William R. Scott5,8*, Sonja Kunze1,2, Pei-Chien Tsai9, Janina S. Ried10,
Weihua Zhang5,11, Youwen Yang5, Sili Tan12, Giovanni Fiorito13,14,
Lude Franke15, Simonetta Guarrera13,14, Silva Kasela16,17, Jennifer Kriebel1,2,3,
Rebecca C. Richmond18, Marco Adamo19, Uzma Afzal5,11,
Mika Ala-Korpela20,21,22, Benedetta Albetti23, Ole Ammerpohl24,
Jane F. Apperley25, Marian Beekman26, Pier Alberto Bertazzi23,
S. Lucas Black27, Christine Blancher28, Marc-Jan Bonder15, Mario Brosch29,
Maren Carstensen-Kirberg3,30, Anton J. M. de Craen31‡, Simon de Lusignan32,
Abbas Dehghan33, Mohamed Elkalaawy19,34, Krista Fischer16,
Oscar H. Franco33, Tom R. Gaunt18, Jochen Hampe29, Majid Hashemi19,
Aaron Isaacs33, Andrew Jenkinson19, Sujeet Jha35, Norihiro Kato36,
Vittorio Krogh37, Michael Laffan25, Christa Meisinger2, Thomas Meitinger38,39,40,
Zuan Yu Mok12, Valeria Motta23, Hong Kiat Ng12, Zacharoula Nikolakopoulou41,
Georgios Nteliopoulos25, Salvatore Panico42, Natalia Pervjakova16,17,
Holger Prokisch38,39, Wolfgang Rathmann43, Michael Roden3,30,44,
Federica Rota23, Michelle Ann Rozario12, Johanna K. Sandling45,46,
Clemens Schafmayer47, Katharina Schramm38,39, Reiner Siebert24,48,
P. Eline Slagboom26, Pasi Soininen20,21, Lisette Stolk49, Konstantin Strauch10,50,
E-Shyong Tai51,52,53, Letizia Tarantini23, Barbara Thorand2,3,
Ettje F. Tigchelaar15, Rosario Tumino54, Andre G. Uitterlinden55,
Cornelia van Duijn33, Joyce B. J. van Meurs49, Paolo Vineis13,56,
Ananda Rajitha Wickremasinghe57, Cisca Wijmenga15, Tsun-Po Yang45,
Wei Yuan9,58, Alexandra Zhernakova15, Rachel L. Batterham19,59,
George Davey Smith18, Panos Deloukas45,60,61, Bastiaan T. Heijmans26,
Christian Herder3,30, Albert Hofman33, Cecilia M. Lindgren4,62, Lili Milani16,
Pim van der Harst15,63,64, Annette Peters2,3,40, Thomas Illig1,2,65,66,
Caroline L. Relton18, Melanie Waldenberger1,2, Marjo-Riitta Järvelin67,68,69,70,
Valentina Bollati23, Richie Soong12,71, Tim D. Spector9, James Scott8,
Mark I. McCarthy4,72,73, Paul Elliott5,74, Jordana T. Bell9§,
Giuseppe Matullo13,14§, Christian Gieger1,2§, Jaspal S. Kooner8,11,74§,
Harald Grallert1,2,3§ & John C. Chambers5,11,74,75§
Supplementary Information is available in the online version of the paper.
Acknowledgements Detailed acknowledgments are provided in the
Supplementary Information.
Author Contributions Data collection and analysis in the contributing
population studies, ALSPAC study: T.R.G., C.L.R., R.C.R., G.D.S.; EGCUT study:
K.F., S.Ka., L.M., N.P.; EPICOR study: G.F., S.G., V.K., G.M., S.P., R.T., P.V.; KORA
study: M.C.K., C.G., H.G., C.H., T.I., J.K., S.Ku., C.M., T.M., A.P., H.P., J.S.R., M.R.,
W.R., K.Sc., K.St., B.T., M.W., S.W.; Leiden Longevity Study: M.B., A.J.M.d.C.,
B.T.H., P.E.S.; LIFELINES study: M.J.B., L.F., P.v.d.H., E.F.T., C.W., A.Z.; LOLIPOP
study: B.A., U.A., C.B., P.A.B., V.B., J.C.C., A.Dr., P.E., M.R.J., S.J., J.S.K., M.A.K., N.K.,
B.L., C.M.L., M.Lo., S.d.L., M.I.M., V.M., Z.Y.M., H.K.N., F.R., M.A.R., J.S., P.S., R.So.,
W.R.S., E.S.T., L.T., S.T., A.R.W., W.Z.; Rotterdam Study: A.De., C.v.D., O.H.F., A.H.,
A.I., J.B.J.v.M., L.S., A.G.U.; TwinsUK study: J.T.B., P.D., J.K.S., T.D.S., P.C.T., T.P.Y.,
W.Y. Data collection and molecular analyses in isolated cell subsets; adipocytes:
M.A., R.L.B., J.C.C., M.E., M.H., A.J., J.S.K., Z.Y.M., H.K.N., M.A.R., J.S., R.So., W.R.S.,
S.T.; hepatocytes: O.A., M.Br., J.H., C.S., R.Si.; leucocytes: J.F.A., S.L.B., J.C.C.,
J.S.K., M.La., Z.Y.M., H.K.N., G.N., Z.N., M.A.R., R.So., W.R.S., S.T., Y.Y. Data analysis
and writing group; J.C.C., A.Dr., P.E., J.S.K., C.G., H.G., B.L., M.Lo., G.M., M.I.M., J.S.,
W.R.S., S.W.
Author Information Reprints and permissions information is available at
www.nature.com/reprints. The authors declare no competing financial
interests. Readers are welcome to comment on the online version of the
paper. Correspondence and requests for materials should be addressed to
J.C.C. (john.chambers@ic.ac.uk), H.G. (harald.grallert@helmholtz-muenchen.de)
or J.S.K. (j.kooner@ic.ac.uk).
Reviewer Information Nature thanks M. Boehnke, J. M. Greally, B. Voight and
the other anonymous reviewer(s) for their contribution to the peer review of this
work.
1Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research
Centre for Environmental Health, Neuherberg, Germany. 2Institute of Epidemiology II,
Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg,
Germany. 3German Center for Diabetes Research (DZD), München-Neuherberg, Germany.
4Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3
7BN, UK. 5Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment
and Health, School of Public Health, Imperial College London, London W2 1PG, UK. 6Institute
of Health Sciences, P.O. Box 5000, FI-90014 University of Oulu, Finland. 7Translational
Laboratory in Genetic Medicine (TLGM), Agency for Science, Technology and Research
(A*STAR), 8A Biomedical Grove, Immunos, Level 5, Singapore 138648, Singapore. 8National
Heart and Lung Institute, Imperial College London, London W12 0NN, UK. 9Department of Twin
Research and Genetic Epidemiology, King’s College London, London SE1 7EH, UK. 10Institute
of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for
Environmental Health, Neuherberg, Germany. 11Ealing Hospital NHS Trust, Middlesex UB1
3HW, UK. 12Cancer Science Institute of Singapore, National University of Singapore, Singapore.
13Human Genetics Foundation—Torino, Torino, Italy. 14Medical Sciences Department, University
of Torino, Torino, Italy. 15University of Groningen, University Medical Center Groningen,
Department of Genetics, 9700 RB Groningen, The Netherlands. 16Estonian Genome Center,
University of Tartu, Riia 23b, 51010 Tartu, Estonia. 17Department of Biotechnology, Institute of
Molecular and Cell Biology, University of Tartu, Riia 23, 51010 Tartu, Estonia. 18MRC Integrative
Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol,
Bristol BS8 2BN, UK. 19UCLH Bariatric Centre for Weight Loss, Weight Management and
Metabolic and Endocrine Surgery, University College London Hospitals, Ground Floor West
Wing, 250 Euston Road, London NW1 2PG, UK. 20Computational Medicine, Faculty of Medicine,
University of Oulu and Biocenter Oulu, Oulu, Finland. 21NMR Metabolomics Laboratory, School
of Pharmacy, University of Eastern Finland, Kuopio, Finland. 22Computational Medicine, School
of Social and Community Medicine, University of Bristol and Medical Research Council
Integrative Epidemiology Unit, University of Bristol, Bristol, UK. 23EPIGET Lab, Department of
Clinical Sciences and Community Health, Università degli Studi di Milano and Fondazione
IRCCS Ca’Granda Ospedale Maggiore Policlinico, Milan, Italy. 24Institute of Human Genetics,
University Hospital Schleswig-Holstein, Kiel Campus, Kiel, Germany. 25Centre for Haematology,
Department of Medicine, Faculty of Medicine, Imperial College London, Hammersmith Campus,
London W12 0NN, UK. 26Molecular Epidemiology, Leiden University Medical Center, Leiden,
2333 ZC, The Netherlands. 27Section of Infectious Diseases and Immunity, Department of
Medicine, Imperial College London, London W12 0NN, UK. 28High Throughput Genomics—
Oxford Genomic Centre, Wellcome Trust Centre for Human Genetics, University of Oxford,
Oxford OX3 7BN, UK. 29Medical Department 1, University Hospital of the Technical University
Dresden, Dresden, Germany. 30Institute for Clinical Diabetology, German Diabetes Center,
Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, Düsseldorf,
Germany. 31Gerontology and Geriatrics, Leiden University Medical Center, Leiden 2300 RC,
The Netherlands. 32Department of Clinical and Experimental Medicine, University of Surrey,
Guildford GU2 7PX, UK. 33Department of Epidemiology, Erasmus Medical Centre, Rotterdam,
The Netherlands. 34Clinical and Experimental Surgery Department, Medical Research Institute,
University of Alexandria, Hadara, Alexandria 21561, Egypt. 35Department of Endocrinology,
Diabetes and Obesity, Max Healthcare, New Delhi 110 017, India. 36Department of Gene
Diagnostics and Therapeutics, Research Institute, National Center for Global Health and
Medicine, Tokyo 1628655, Japan. 37Epidemiology and Prevention Unit, Fondazione IRCSS
Istituto Nazionale Tumori, Milano, Italy. 38Institute of Human Genetics, Helmholtz Zentrum
München, German Research Center for Environmental Health, Neuherberg, Germany.
39Institute of Human Genetics, Technical University Munich, München, Germany. 40DZHK
(German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich,
Germany. 41Vascular Biology Section, National Heart and Lung Institute, Faculty of Medicine,
Imperial College London, London SW3 6LY, UK. 42Dipartmento Di Medicina Clinica E Chirurgia
Federio II University, Naples, Italy. 43Institute for Biometrics and Epidemiology, German
Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf,
Düsseldorf, Germany. 44Department of Endocrinology and Diabetology, Medical Faculty,
Heinrich Heine University Hospital Düsseldorf, Düsseldorf, Germany. 45Wellcome Trust Sanger
Institute, Wellcome Trust Genome Campus, Hinxton, UK. 46Department of Medical Sciences,
Molecular Medicine and Science for Life Laboratory, Uppsala University, 751 44 Uppsala,
Sweden. 47Department of Visceral and Thoracic Surgery, University Hospital Schleswig-
Holstein, Kiel Campus, Kiel, Germany. 48Institute of Human Genetics, University Hospital of Ulm,
Albert-Einstein-Allee 11, D-89081 Ulm, Germany. 49Department of Internal Medicine, Erasmus
Medical Centre, Rotterdam, The Netherlands. 50Institute of Medical Informatics, Biometry and
Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich,
Germany. 51Department of Medicine, Yong Loo Lin School of Medicine, National University of
Singapore, Singapore 119228, Singapore. 52Saw Swee Hock School of Public Health, National
University of Singapore, Singapore 117597, Singapore. 53Duke-National University of
Singapore Graduate Medical School, Singapore 169857, Singapore. 54Cancer Registry and
Histopathology Unit, ‘Civile—M.P. Arezzo’ Hospital, ASP 7, Ragusa, Italy. 55Departments of
Internal Medicine and Epidemiology, Erasmus Medical Centre, Rotterdam, The Netherlands.
56Epidemiology and Public Health, Imperial College London, London, UK. 57Department of
Public Health, Faculty of Medicine, University of Kelaniya, PO Box 6, Thalagolla Road, Ragama
11010, Sri Lanka. 58The Institute of Cancer Research, Surrey SM2 5NG, UK. 59Centre for
Obesity Research, Rayne Institute, Department of Medicine, University College London, London
WC1E 6JJ, UK. 60William Harvey Research Institute, Barts and The London School of Medicine
and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK. 61Princess Al-Jawhara
Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King
Abdulaziz University, Jeddah 21589, Saudi Arabia. 62Broad Institute of the Massachusetts
Institute of Technology and Harvard University, Cambridge, Massachusetts 02142, USA.
63University of Groningen, University Medical Center Groningen, Department of Cardiology,
9700 RB Groningen, The Netherlands. 64Durrer Center for Cardiogenetic Research,
ICIN—Netherlands Heart Institute, 3511 GC Utrecht, The Netherlands. 65Hannover Unified
Biobank, Hannover Medical School, Feodor-Lynen-Strasse 15, D-30625 Hanover, Germany.
66Institute of Human Genetics, Hannover Medical School, Carl-Neuberg-Strasse 1, D-30625
Hanover, Germany. 67Department of Epidemiology and Biostatistics, MRC Health Protection
Agency (HPE) Centre for Environment and Health, School of Public Health, Imperial College
London, London, UK. 68Biocenter Oulu, P.O. Box 5000, Aapistie 5A, FI-90014 University of Oulu,
Finland. 69Center for Life Course Epidemiology, Faculty of Medicine, P.O. Box 5000, FI-90014
University of Oulu, Finland. 70Unit of Primary Care, Oulu University Hospital, Kajaanintie 50,
PO Box 20, FI-90220 Oulu, 90029 OYS, Finland. 71Department of Pathology, National University
Hospital, Singapore. 72Oxford Centre for Diabetes Endocrinology and Metabolism, University of
Oxford, Oxford, UK. 73Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford OX3
7LJ, UK. 74Imperial College Healthcare NHS Trust, London W12 0HS, UK. 75Lee Kong Chian
School of Medicine, Nanyang Technological University, Singapore.
*These authors contributed equally to this work.
‡Deceased.
§These authors jointly supervised this work.
© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
METHODS
Population samples. Details of the population samples for discovery and replication
are provided in the Supplementary Information. No statistical methods were
used to predetermine sample size. The experiments were not randomized. All
molecular assays were carried out blind to clinical status.
Quantification of DNA methylation. DNA methylation was quantified in
bisulfite-converted genomic DNA from whole blood, using the Illumina Infinium
HumanMethylation450 array for all samples. Cohort-specific methods are
summarized in Supplementary Table 2. DNA methylation was quantified on a scale
of 0–1, in which 1 represents 100% methylation. Preprocessing and quality control
criteria are summarized in Supplementary Table 2.
The association of DNA methylation with BMI (a measure of adiposity) was
tested in each cohort separately by linear regression using an established analytic
strategy to reduce batch effects and other technical confounding effects in
quantification of DNA methylation, and to take the potential confounding effects
arising from cryptic alterations in the white blood cell composition of blood into
account. In brief, in the LOLIPOP and KORA studies, raw signal intensities were
retrieved using the function readIDAT of the R package minfi v.1.6.0 from the
Bioconductor open source software (http://www.bioconductor.org/), followed
by background correction with the function bgcorrect.illumina from the same
R package. Detection P values were derived using the function detectionP as the
probability of the total signal (methylated + unmethylated) being detected above
the background signal level, as estimated from negative-control probes. Signals
with detection P ≥ 0.01 were removed. Similarly, signals summarized from fewer
than three functional beads on the chip were removed. Observations with fewer than
95% of CpG sites providing a signal were subsequently excluded from the dataset.
To reduce non-biological variability between observations, data were quantile
normalized with the function normalizeQuantiles of the R package limma v.2.12.0
from Bioconductor, separately in six probe categories based on probe type and
colour channel. If not stated otherwise, this preprocessing pipeline was used for
all data used in downstream analyses.
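The filtering and normalization steps described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the minfi and limma pipeline used in the study; the function names `mask_low_quality` and `quantile_normalize` and the array layout (rows are CpG sites, columns are samples) are assumptions made for the example, and the normalization shown ignores missing values and the six probe categories.

```python
import numpy as np

def mask_low_quality(signal, det_p, n_beads, p_max=0.01, min_beads=3, min_call_rate=0.95):
    """Blank unreliable signals, then drop samples with a low call rate.

    All inputs have shape (n_cpgs, n_samples). Names and layout are hypothetical.
    """
    signal = np.asarray(signal, dtype=float).copy()
    # Remove signals with detection P >= 0.01 or fewer than three beads.
    bad = (np.asarray(det_p) >= p_max) | (np.asarray(n_beads) < min_beads)
    signal[bad] = np.nan
    # Keep samples in which at least 95% of CpG sites still provide a signal.
    call_rate = 1.0 - np.isnan(signal).mean(axis=0)
    keep = call_rate >= min_call_rate
    return signal[:, keep], keep

def quantile_normalize(x):
    """Give every column (sample) the same empirical distribution.

    Simplified: assumes a complete matrix without missing values.
    """
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)  # rank of each value within its column
    reference = np.sort(x, axis=0).mean(axis=1)        # mean of the k-th smallest values
    return reference[ranks]
```

In the study the normalization was applied separately within probe categories; with this sketch that would amount to calling `quantile_normalize` once per category on the corresponding rows.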
In order to account for technical effects during the experiment, we performed
principal component analysis on the signal intensities for the 235 positive control
probes on the 450k array, which assess multiple steps in the laboratory processing.
The resulting principal components are thought to capture technical variability in
the experiment and the first 20 control-probe principal components were included
as covariates in the model to remove technical biases.
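As a rough illustration of how control-probe principal components can be derived, the sketch below centres a samples-by-probes intensity matrix and takes sample scores from a singular value decomposition. The function name `control_probe_pcs`, the centring without scaling, and the simulated input are assumptions for the example; the source does not state these details.

```python
import numpy as np

def control_probe_pcs(control_intensity, n_pcs=20):
    """Return sample scores on the leading principal components.

    control_intensity: array of shape (n_samples, n_control_probes).
    """
    x = np.asarray(control_intensity, dtype=float)
    x = x - x.mean(axis=0)                       # centre each control probe
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]              # one column per component

rng = np.random.default_rng(0)
demo = rng.normal(size=(50, 235))                # simulated intensities, not study data
pcs = control_probe_pcs(demo)                    # 50 samples x 20 covariates
```

Each column of `pcs` would then enter the regression model as one technical covariate.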
To estimate proportions of white-blood-cell types, we used a previously
described method22, which provides 500 CpG sites showing the most pronounced
cell-type-specific methylation levels in an experiment based on purified cells.
Of these, 473 CpGs were available on the 450k array. Following the proposed
procedure and using the R code provided with the manuscript (R function
projectWBC), we used these 473 CpG sites to infer white-blood-cell proportions
(that is, the proportion of granulocytes, monocytes, B cells, CD4+ T cells, CD8+
T cells and natural killer cells) in our samples. These proportions were subsequently
used as covariates in the model to avoid cell-type confounding.
Epigenome-wide association. We performed single-marker tests separately in each
cohort using linear regression to analyse the association of each autosomal CpG site
with BMI; association results are presented as the change in BMI per unit change
in methylation (0–1 scale, corresponding to 0–100% change in methylation).
We adjusted for age, gender, smoking status, physical activity index and alcohol
consumption, as well as for the first 20 control-probe principal components and
for the estimated white-blood-cell proportions; this set of covariates is henceforth
referred to as ‘discovery covariates’. We corrected the association results for the
genomic control inflation factor (GCin), in order to account for population stratification
and other forms of cryptic structure in the data, which can arise for instance
from unobserved confounding. Markers on the sex chromosomes were tested
similarly for association with BMI, but separately in men and women. Results
were combined across cohorts by inverse variance meta-analysis using METAL
v.2011-03-25 (http://www.sph.umich.edu/csg/abecasis/Metal/). The resulting
P values were then corrected in a second round of genomic control (GCout).
There were 466,186 autosomal markers for analysis after quality control.
We set the threshold for epigenome-wide significance as P < 1 × 10−7, to provide a
conservative Bonferroni correction for the number of markers tested23.
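The meta-analysis and genomic control steps can be illustrated with the standard fixed-effect formulas. The sketch below is not METAL's code; the function names are hypothetical, and leaving an inflation factor below 1 uncorrected is a common convention assumed here rather than something stated in the text.

```python
import math
from statistics import NormalDist, median

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of per-cohort estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))   # two-sided normal P value
    return beta, se, p

def genomic_control_lambda(z_scores):
    """Median observed chi-square divided by the median of a 1-d.f. chi-square."""
    expected_median = NormalDist().inv_cdf(0.75) ** 2    # about 0.455
    return median(z * z for z in z_scores) / expected_median

def apply_genomic_control(se, lam):
    """Inflate a standard error by sqrt(lambda); lambda below 1 is left uncorrected."""
    return se * math.sqrt(max(lam, 1.0))

bonferroni = 0.05 / 466_186   # about 1.07e-7, so P < 1e-7 is slightly stricter
```

Under these assumptions, GCin corresponds to applying `apply_genomic_control` to each cohort's standard errors before `inverse_variance_meta`, and GCout to repeating the correction on the combined results.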
As additional analyses we also investigated the relationship between BMI and
DNA methylation amongst the 11,233 X-chromosomal and 417 Y-chromosomal
CpG sites assayed. Our sample size (n = 5,387 individuals) provides 80% power
to identify a change of 8.4 kg m−2 in BMI per unit increase in methylation (that is,
0–1, in which 1 is 100% methylation) at P < 1.0 × 10−7.
To assess the stability of discovery results towards the analytic choices made, we
performed sensitivity analyses to determine the impact of control-probe principal
components, methylation principal components, and genetic principal components
as covariates. Specifically, we compared results from the discovery meta-analysis
when the first 10, 20, 30 and 40 control-probe principal components were included
as covariates, and when 10 or 20 principal components derived from a principal component
analysis on the matrix of methylation β values, 10 or 20 principal components
derived from a principal component analysis on the matrix of methylation values
adjusted for the discovery covariates and BMI, or 5 principal components derived
from a principal component analysis on SNP data were included as covariates.
Principal component analysis of the methylation data was performed separately
for each cohort based on quantile normalized β values of autosomal probes without
missing data. Genetic principal components (SNP principal components) were
generated separately for each cohort and genotyping platform (Supplementary
Table 25). The correlation between SNP principal components and methylation
principal components was assessed using linear regression (Supplementary Fig. 9).
Discovery results were stable to the considered variations in covariates,
with correlations of effect sizes between the models ranging from 0.99 to 1.0
(Supplementary Figs 3, 10). In addition, SNPs in the probe sequences did not materially
affect the observed associations (Supplementary Fig. 4 and Supplementary
Table 5).
Replication testing. Markers associated with BMI at P < 1 × 10−7 in the discovery
experiment within ±500 kb of each other were considered a single genetic region.
At each locus we identified the CpG site with the lowest P value for association with
BMI (sentinel marker). Our choice of 1 Mb to define a genetic locus was made to
account for long-range enhancers.
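One way to read the locus definition above is to chain neighbouring significant markers on the same chromosome and then pick the lowest P value per chain. The sketch below follows that reading; the chaining rule, the function names and the example markers are assumptions for illustration and may differ from the authors' exact procedure.

```python
def group_into_loci(markers, window=500_000):
    """markers: iterable of (cpg_id, chrom, pos, p_value) tuples.

    Consecutive significant markers on the same chromosome that lie within
    `window` bp of each other are chained into one locus (an assumed rule).
    """
    loci = []
    for m in sorted(markers, key=lambda m: (m[1], m[2])):
        last = loci[-1][-1] if loci else None
        if last is not None and last[1] == m[1] and m[2] - last[2] <= window:
            loci[-1].append(m)
        else:
            loci.append([m])
    return loci

def sentinel_markers(loci):
    """Pick the marker with the lowest P value in each locus."""
    return [min(locus, key=lambda m: m[3]) for locus in loci]

# Hypothetical markers for illustration only (IDs, positions and P values are made up).
demo = [
    ("cpgA", "1", 1_000_000, 3e-9),
    ("cpgB", "1", 1_300_000, 5e-12),
    ("cpgC", "1", 2_500_000, 1e-8),
    ("cpgD", "2", 1_100_000, 2e-10),
]
loci = group_into_loci(demo)
sentinels = sentinel_markers(loci)
```

Here `cpgA` and `cpgB` fall into one locus because they are 300 kb apart, while `cpgC` is more than 500 kb from its nearest neighbour and forms its own locus.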
We carried out replication testing of the sentinel marker at each locus in separate
samples of whole blood from European and Indian Asian men
and women in population-based studies (n = 4,874, Supplementary Table 1). The
207 sentinel CpG sites were assayed using the Illumina 450K methylation array;
cohort-specific details of analysis pipelines are described in Supplementary Table 2.
Results were combined across discovery and replication by weighted z meta-analysis.
Epigenome-wide significance was set at P < 1 × 10−7, providing Bonferroni
correction for the 466,186 autosomal markers tested. Our choice of threshold is
supported by the results of permutation testing23. Twenty of the 207 markers did not
reach P < 0.05 in replication testing. However, all 20 showed consistent direction
of effect between discovery and replication stages (P = 1.9 × 10−6, binomial test;
Supplementary Table 3), suggesting that the majority are unlikely to be false
positive associations.
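A weighted z meta-analysis can be sketched as below. Weighting each stage by the square root of its sample size is a common scheme and is assumed here, since the text does not give the weights; the z-scores in the example are made up, while the two sample sizes are those reported for discovery and replication.

```python
import math

def weighted_z(z_scores, sample_sizes):
    """Combine z-scores with weights equal to the square root of each sample size."""
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / math.sqrt(sum(w * w for w in weights))
    p = math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal P value
    return z, p

# Illustrative z-scores (made up) combined across the two stages.
z, p = weighted_z([-5.2, -3.1], [5_387, 4_874])
```

Because both stages point in the same direction, the combined z-score is more extreme than either input.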
To assess whether the 187 sentinel CpGs were enriched for intermediately
methylated CpGs (sites with 20–80% average methylation), we randomly generated
100,000 sets of 187 CpGs and determined the number of intermediately methylated
CpGs for each of them in order to derive an expected distribution under the null
hypothesis of no enrichment. We then compared the observed number of intermediately
methylated CpGs for the 187 sentinel CpGs against the null distribution
to calculate an empirical P value.
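The resampling procedure can be sketched as follows. The function name, the one-sided comparison and the add-one correction in the empirical P value are assumptions for illustration; the text specifies only the number of random sets and the set size.

```python
import random

def empirical_enrichment_p(is_intermediate, n_draw, observed, n_sets=100_000, seed=0):
    """One-sided empirical P value for enrichment of intermediately methylated CpGs.

    is_intermediate: list of booleans, one per CpG on the array (True if 20-80% methylated).
    n_draw: size of each random set (187 in the study).
    observed: number of intermediately methylated CpGs among the sentinel CpGs.
    """
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_sets):
        count = sum(rng.sample(is_intermediate, n_draw))   # draw without replacement
        if count >= observed:
            at_least += 1
    # Add-one correction so that the estimate is never exactly zero (an assumed choice).
    return (at_least + 1) / (n_sets + 1)
```

A call such as `empirical_enrichment_p(flags, 187, observed_count)` would reproduce the design described, given a per-CpG list `flags` for the array.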
An exact binomial test (R function binom.test) was used to test whether a
consistent direction of effect between discovery and replication was observed more often
than expected by chance amongst the 20 non-replicating CpG sites.
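The reported binomial P value can be checked by hand. The sketch below implements a two-sided exact binomial test by summing the probabilities of all outcomes no more likely than the observed one, which is the approach R's binom.test takes; the function name is hypothetical. With 20 of 20 markers directionally consistent and a null probability of 0.5, it gives 2 × 0.5^20 ≈ 1.9 × 10−6, in line with the value quoted in the text.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Two-sided exact binomial P value for k successes in n trials."""
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Sum outcomes that are at most as probable as the observed one
    # (small relative tolerance to absorb floating-point noise).
    return min(1.0, sum(x for x in pmf if x <= pmf[k] * (1 + 1e-7)))

p_direction = binom_two_sided_p(20, 20)
```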
Replication by pyrosequencing. As a technical validation, we used pyrosequencing
to carry out replication testing of the relationship between DNA methylation and
BMI at 4 loci, using samples of whole blood from 990 Europeans and 1,720 Indian
Asians participating in the LOLIPOP study. Pyrosequencing was carried out using
biotinylated primers to amplify bisulfite-treated DNA (Supplementary Table 26).
The biotinylated PCR products were then immobilized on streptavidin-coated
Sepharose beads (GE Healthcare). Pyrosequencing was performed with the
PyroMark Q96 MGMT kit (Qiagen) on a PSQ™96 MA system (Biotage).
Samples for isolated white blood cell studies. Thirty obese (BMI > 35 kg m−2) and
30 normal-weight (BMI < 25 kg m−2) individuals were recruited at random from the
outpatient departments at Ealing and University College London Hospitals. All participants
gave written informed consent for inclusion in the study (research ethics committee
references: 07/H0712/150, 13/LO/0477 and ID#09/H0715/65). Obese subjects
and normal-weight controls were matched by age (within 5 years), sex and ethnicity.
Fluorescence activated cell sorting of white blood cells. For each participant,
we collected 12 ml whole blood (EDTA). Samples were processed immediately
to isolate white blood cell subsets (monocytes, neutrophils, CD4 and CD8
lymphocytes) through red blood cell lysis according to the manufacturer’s instructions
(BioLegend) and staining of unlysed white-blood-cell subsets (> 20 min
in 50 μl) Ca2+-free PBS with 5 mM EDTA and 1% human albumin; containing
1 μl anti-CD14 PE-Cy7 (Clone-M5E2, BD), anti-CD16 BV510 (Clone-3G8,
BioLegend), anti-CD45 BV605 (Clone-HI30, BioLegend), anti-CD8 APC
(Clone-SK1, BioLegend); 2 μl anti-CD3 PE (Clone-Leu-4, BD), anti-CD4 FITC
(Clone-RPA-T4, BioLegend). Stained samples were filtered to remove clumped
cells (30 μm mesh, Miltenyi Biotec) and dead cells were stained (1 μl Sytox Blue,
Life Technologies)24,25.
Lysed, stained samples were sorted on a FACSAria II SORP cell sorter at a
flow rate of 6,000–9,000 events per second. Data were collected with FACSDiva8
and analysed with FlowJo V10. Fluorescence minus one negative controls (that
is, without the primary labelled antibody of interest) were used to determine
positive and negative boundaries for each gate in the experimental set up. Daily
Cytometer Set-up and Tracking quality control beads were run to ensure alignment
and parameterization of the cell sorter (Anti-Mouse Igκ and Negative Control,
BSA; Compensation Plus Particles, BD). Sytox Blue (450/50V nm) negative events
were considered to be live cells. FSC-A and SSC-A were then used to separate
granulocytes from monocyte and lymphocyte populations. Neutrophils (CD14−,
CD16+) were separated from other granulocytes. Monocytes were then separated
from lymphocytes in a two-stage process as CD14+, CD45+ and CD16− cells.
Finally, CD4+ and CD8+ cells were separated from other lymphocytes based on the
following staining patterns: CD4+ cells: CD3+, CD4+, CD8−, CD14− and CD45+;
and CD8+ cells: CD3+, CD4−, CD8+, CD14− and CD45+. Sorted cell subsets were
assessed for purity, then pelleted and snap-frozen for storage at −80 °C. Average
purities were: neutrophils 98.3% (± 1.2% (s.d.)); monocytes 99.2% (± 0.7%); CD4+
lymphocytes 99.6% (± 0.4%); CD8+ lymphocytes 97.9% (± 2.0%).
Genomic DNA was isolated (Qiagen QIAshredder; Allprep DNA/RNA Micro)
according to manufacturer’s instructions. Isolated genomic DNA was quantified
(Qubit double-stranded DNA broad range assay) then stored at −80 °C for
genome-wide DNA-methylation assays.
Quantification of DNA methylation and data processing. Genomic DNA (0.2–1.0 μg)
was bisulfite converted using an EZ DNA Methylation-Direct Kit (Zymo
Research). In brief, DNA samples were bisulfite converted by incubation with
the CT conversion reagent for 8 min at 98 °C, 3.5 h at 64 °C, followed by 18 h at
4 °C in a thermocycler. The treated DNA was added to a Zymo-Spin IC Column,
desulfonated using M-desulphonation buffer, and then eluted from the column
in 12 μl of M-elution buffer.
Methylation analysis of the bisulfite-treated DNA was performed using an
Illumina Infinium MethylationEPIC Beadchip (Illumina) according to the standard
protocol. In brief, 4 μl of bisulfite-treated DNA was denatured, neutralized and
amplified with an overnight whole-genome amplification reaction. The amplified
DNA was then enzymatically fragmented, precipitated and resuspended in
hybridization buffer before being dispensed onto the MethylationEPIC beadchips
for hybridization. After hybridization, the beadchips were processed through a
primer-extension protocol and subsequently stained. Finally, the beadchips were
coated and imaged using the HiScan System (Illumina).
All samples passed quality control and principal component analysis showed
clear separation of cell-types. Methylation values for 179 (out of 187) sentinel
CpGs were retrieved, as described above for epigenome-wide association in blood,
and the difference in DNA methylation between obese cases and normal weight
controls was tested using linear regression, adjusted for age, gender and ethnicity.
Genetic association studies. We used genetic association and the concept of
Mendelian randomization to investigate potential causal relationships between
DNA methylation and adiposity11. In brief, Mendelian randomization derives from
the more general instrumental variable concept. As an instrumental variable, it uses
a genetic variant (or a combination of genetic variants) Z associated with a variable
X in order to show a causal relation between X and another variable Y. It relies on
the fact that the alleles of a genetic variant are inherited randomly from parents
to offspring, so that the relation of a genetic variant with a phenotype should not
be confounded (with exceptions including population stratification). Thus, if the
effect of X on Y is causal and the study has enough power, Z should also associate
with Y. Specifically, the predicted association of Z with Y can be calculated as
follows, assuming linear relationships and assuming that Z is unrelated to Y given
X and unrelated to any unobserved confounders U (where α is the intercept, and β
and γ are the beta-coefficients for the relationships of X with Z and U, respectively):
$$X = \alpha_1 + \beta_1 Z + \gamma_1 U$$
where γ1U plays the role of the error term that is per assumption unrelated to Z, and
$$\begin{aligned}
Y &= \alpha_2 + \beta_2 X + \gamma_2 U \\
  &= \alpha_2 + \beta_2(\alpha_1 + \beta_1 Z + \gamma_1 U) + \gamma_2 U \\
  &= \alpha_2 + \beta_2\alpha_1 + \beta_2\beta_1 Z + (\beta_2\gamma_1 + \gamma_2) U \\
  &= \alpha_3 + \beta_3 Z + \gamma_3 U
\end{aligned}$$
Therefore the predicted effect of Z on Y is β3 = β2β1.
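The relation β3 = β2β1 can be checked numerically with a small simulation. The sketch below is illustrative only: the parameter values, the allele frequency and the helper names are arbitrary assumptions, and intercepts are set to zero for simplicity. It also shows why the instrument is useful, since a naive regression of Y on X is distorted by the shared confounder U.

```python
import random
from statistics import fmean

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(beta1, beta2, gamma1=1.0, gamma2=1.0, n=50_000, maf=0.3, seed=1):
    rng = random.Random(seed)
    z = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]  # allele dosage 0, 1 or 2
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]                          # unobserved confounder
    x = [beta1 * zi + gamma1 * ui for zi, ui in zip(z, u)]               # X = beta1*Z + gamma1*U
    y = [beta2 * xi + gamma2 * ui for xi, ui in zip(x, u)]               # Y = beta2*X + gamma2*U
    return z, x, y

z, x, y = simulate(beta1=0.5, beta2=0.4)
slope_zy = ols_slope(z, y)   # estimates beta3, expected near beta2 * beta1 = 0.2
slope_xy = ols_slope(x, y)   # naive estimate, inflated well above beta2 by U
```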
Unbiased estimation and formal inference on the causal effect β2 of X on Y
(where X and Y represent a CpG–phenotype pair) relies on strong genetic effects
and typically requires tens of thousands of samples for adequate power26. As these
sample sizes are currently not available for epigenomic datasets, we explored the
consistency of the predicted effect of Z on Y compared to the observed effect
instead, thereby obtaining some indication on the plausibility of a causal effect of
X on Y. This was done in two directions, studying causality of the effect of DNA
methylation (X) on BMI (Y) and of BMI (X) on DNA methylation (Y).
DNA methylation as determinant of BMI (causal analysis). To investigate
whether DNA methylation is a determinant of BMI (whereby X = DNA
methylation, Y = BMI) we used data on genetic variants from 4,034 participants
of the KORA and LOLIPOP studies (Supplementary Table 25) to identify cis (1 Mb)
SNPs (Z) influencing methylation in blood at the 187 sentinel CpG sites. The
associations between SNPs and methylation were tested in each dataset separately
using linear models with methylation as response and SNP as independent
variable, adjusting for the discovery covariates, and then combined by inverse
variance meta-analysis using METAL v.2011-03-25. Our sample size (nmax = 4,034
individuals) provides 80% power to identify a change in methylation of 0.5% (in
absolute terms) per allele copy at P < 5.0 × 10−8 (that is, genome-wide significance).
Results for all 173,367 pairs reaching P < 5 × 10−8 (conventional genome-wide
significance) are provided in Supplementary Table 27. We excluded 3 CpGs
that shared no cis SNPs across all datasets, and a further 9 CpGs because they had
SNPs within their probe-binding sequence. For the remaining 175 CpG sites, the
single SNP with the lowest P value for association with methylation was chosen
as an instrumental variable (Supplementary Table 28). As mentioned above, to be
an appropriate instrument, a SNP must not be directly associated with BMI (Y),
only through the respective CpG (X). For this purpose, we removed six CpG–SNP
pairs from the analysis because the corresponding SNPs remained associated with
BMI after adjustment for the sentinel CpG (cg07136133, cg08548559, cg09152259,
cg12484113, cg18120259 and cg26403843). Statistical significance was inferred at
P < 2.9 × 10−4 (corresponding to P < 0.05 after Bonferroni correction for 175 tests).
To enable comparison with the observed effect of SNPs on BMI obtained
from published data, we assessed the relationship between DNA methylation and
adiposity in linear models, using an inverse-normal transformation of BMI as the
outcome variable to be consistent with the GIANT genome-wide association study
(GWAS)12. The associations between DNA methylation and inverse-normal transformed
BMI were quantified in LOLIPOP and KORA cohorts separately, followed
by inverse variance meta-analysis using METAL v.2011-03-25. We then calculated
the predicted effect sizes and standard errors (βpred and SEpred, respectively) as
follows (where the term A~B refers to the magnitude of the relationship between
A and B in regression analysis):
$$\beta_{\mathrm{pred}} = \beta_{\mathrm{CpG\sim SNP}} \times \beta_{\mathrm{BMI\sim CpG}}$$
$$\mathrm{SE}_{\mathrm{pred}} = \sqrt{\beta_{\mathrm{CpG\sim SNP}}^{2} \times \mathrm{SE}_{\mathrm{BMI\sim CpG}}^{2} + \mathrm{SE}_{\mathrm{CpG\sim SNP}}^{2} \times \beta_{\mathrm{BMI\sim CpG}}^{2} + \mathrm{SE}_{\mathrm{BMI\sim CpG}}^{2} \times \mathrm{SE}_{\mathrm{CpG\sim SNP}}^{2}}$$
The predicted effect sizes were compared to the observed effects of SNPs on BMI,
whereby the latter were obtained from a large published GWAS to increase power12.
Statistical significance for individual SNPs was inferred at P < 2.9 × 10−4. We used
correlation analysis to investigate the global relationship between predicted and
observed effects on BMI for the SNPs influencing DNA methylation across the
sentinel CpG sites.
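The predicted effect and its standard error can be computed as in the sketch below. It assumes the usual variance formula for a product of two independent estimates and takes the square root of the summed terms to obtain a standard error; the function name and the input values are hypothetical.

```python
import math

def predicted_effect(b_cpg_snp, se_cpg_snp, b_bmi_cpg, se_bmi_cpg):
    """Predicted SNP effect on BMI via methylation, with its standard error."""
    b_pred = b_cpg_snp * b_bmi_cpg
    var_pred = (
        b_cpg_snp ** 2 * se_bmi_cpg ** 2
        + se_cpg_snp ** 2 * b_bmi_cpg ** 2
        + se_bmi_cpg ** 2 * se_cpg_snp ** 2
    )
    return b_pred, math.sqrt(var_pred)

# Made-up inputs for illustration only.
b_pred, se_pred = predicted_effect(2.0, 0.1, 3.0, 0.2)

# Bonferroni threshold for 175 instruments, which rounds to the 2.9e-4 quoted in the text.
threshold = 0.05 / 175
```

The predicted estimate can then be set against the observed SNP effect on BMI from published GWAS data, one SNP at a time.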
DNA methylation as consequence of BMI (consequential analysis). To test the
hypothesis of DNA methylation being a consequence of BMI (whereby X = BMI,
Y = DNA methylation), we followed a similar procedure as described above for the
opposite direction with minor differences.
First, instead of using a single SNP as instrumental variable, we calculated a
weighted genetic risk score (GRS) comprising SNPs reported to influence BMI12.
Again, for the GRS to provide a valid instrument, the included SNPs must not
show direct association with the CpG (Y) but only through BMI (X). For this
purpose we removed three SNPs (rs12444979, rs10968576 and rs7359397) which
remained significantly associated at P < 8.4 × 10−6 (corresponding to P < 0.05 after
Bonferroni correction for the 187 × 32 tests performed) with at least one of the
sentinel CpGs after adjusting for BMI. The final GRS was calculated as the sum of
risk allele dosage of the remaining 29 SNPs previously reported to associate with
BMI, weighted by the reported effect sizes12.
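The weighted genetic risk score can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the SNP identifiers `snp_a` and `snp_b`, all dosages and all effect sizes are hypothetical; only the three excluded rs identifiers come from the text.

```python
# SNPs removed because they remained associated with a sentinel CpG
# after adjustment for BMI (identifiers taken from the text).
EXCLUDED_SNPS = {"rs12444979", "rs10968576", "rs7359397"}


def weighted_grs(dosages, effect_sizes, excluded=EXCLUDED_SNPS):
    """Sum of risk-allele dosages weighted by reported effect sizes.

    dosages: {snp_id: risk-allele dosage between 0 and 2} for one participant.
    effect_sizes: {snp_id: reported per-allele effect on BMI}.
    SNPs listed in `excluded` are skipped.
    """
    score = 0.0
    for snp, weight in effect_sizes.items():
        if snp in excluded:
            continue
        score += weight * dosages[snp]
    return score


# Hypothetical values for one participant.
effect_sizes = {"snp_a": 0.10, "snp_b": 0.25, "rs12444979": 0.30}
dosages = {"snp_a": 2.0, "snp_b": 1.0, "rs12444979": 2.0}
print(weighted_grs(dosages, effect_sizes))
```

Imputed dosages (non-integer values between 0 and 2) can be passed in the same way as hard genotype calls.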
Second, the observed effects of GRS on DNA methylation were quantified using
linear models as described above adjusted for the discovery covariates amongst
participants of the KORA and LOLIPOP studies. Regression analysis was carried
out in the KORA and LOLIPOP cohorts separately and results combined by inverse
variance meta-analysis using METAL v.2011-03-25.
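For readers unfamiliar with inverse variance meta-analysis, the sketch below shows the standard fixed-effect estimator in Python. It is not METAL itself and does not reproduce its input handling; the function name and the example estimates are hypothetical.

```python
import math


def inverse_variance_meta(estimates):
    """Fixed-effect inverse-variance meta-analysis.

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns (pooled beta, pooled SE, z score, two-sided P value).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided P value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-cohort estimates (for example, KORA and LOLIPOP).
beta, se, z, p = inverse_variance_meta([(0.12, 0.03), (0.08, 0.02)])
print(f"beta = {beta:.4f}, se = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

Each cohort is weighted by the inverse of its squared standard error, so the more precise estimate contributes more to the pooled effect.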
DNA methylation in blood and adiposity in prospective population studies.
We used data from the KORA (n = 1,435 Europeans) and LOLIPOP (n = 1,513
Indian Asians) studies to examine the prospective, longitudinal association between
DNA methylation at baseline and subsequent change in BMI during follow-up. We
carried out linear regression with change in BMI during follow-up as response
variable, and technically adjusted baseline methylation as the predictor variable,
with age, sex, physical activity, smoking, alcohol intake, estimated white-blood-cell
proportions and BMI at baseline, as well as follow-up time as additional covariates.
Data were analysed in KORA and LOLIPOP separately, followed by inverse
variance meta-analysis using METAL v.2011-03-25.
We studied the longitudinal relationship between change in BMI and change
in DNA methylation amongst 1,435 participants of the KORA S4/F4 cohort
with methylation data available both at baseline and at the 7-year follow-up
time point. To ensure comparability of methylation measurements from the two
time points measured in two batches, methylation β-values were jointly adjusted
for the first 20 principal components obtained from a principal component
analysis on the positive-control probes, and residuals were subsequently used as
adjusted methylation values. Linear models were used with change in BMI during
follow-up as response variable, and change in technically adjusted methylation
as independent variable, including age, sex, physical activity, smoking, alcohol
intake and estimated white-blood-cell proportions both at baseline and follow-up.
DNA methylation in adipose tissue. We investigated whether the observed
methylation markers in blood are representative of BMI-associated methylation
changes in adipose tissue. We used a dataset of 542 adipose tissue samples from
the TwinsUK study to test association of the 187 identified methylation markers
with BMI. The association of BMI with methylation was quantified using a
linear mixed-effects model adjusting for chip, for bisulfite conversion level and
bisulfite conversion efficiency, smoking state (3 categories: current, former and
never smokers), alcohol intake (in g per day) and age, with zygosity and family
as random effects.
We carried out sensitivity analyses to assess the potential contribution of
cryptic structure arising from differences in cell composition of the adipose tissue
samples. In the absence of validated approaches for imputation and adjustment for
adipose-tissue cell-subset composition, and given the potential limitations of published
reference-free approaches for separation of true and confounded signals23, we used
principal component analysis to quantify latent structure in the adipose-tissue
methylation data, and included the top 5 components as covariates in the regression
model.
We separately compared DNA methylation between paired samples of blood
and subcutaneous adipose tissue (available for the same n = 201 individuals,
TwinsUK). Blood methylation values were first adjusted for age, chip and chip
position, smoking state, alcohol intake, and estimated white-blood-cell subsets
by taking the residuals from a linear model with these as covariates. Similarly,
adipose tissue methylation values were adjusted for age, chip, bisulfite conversion
level, bisulfite conversion efficiency, smoking state, alcohol intake and the top 5
principal components from the adipose methylation data. Pearson’s correlation
was then determined between the adjusted methylation values.
Finally, we used genetic association to carry out causality analyses on the
association between BMI and DNA methylation in adipose tissue, as described
above for blood. We studied a subset of 325 adipose tissue samples from the
TwinsUK cohort with genotype data available. Regression analyses in adipose
tissue between BMI, SNPs/genetic risk score and CpGs were carried out using
the R package lme4, and with smoking, alcohol intake, age, zygosity (random
effect), family (random effect), beadchip, bisulfite conversion batch and bisulfite
conversion efficiency as covariates.
DNA methylation in isolated adipocytes. Subcutaneous adipose tissue
samples were obtained intraoperatively in 24 morbidly obese individuals
(BMI > 40 kg m−2) undergoing laparoscopic bariatric surgery and 24 healthy
controls (BMI < 30 kg m−2) undergoing non-bariatric laparoscopic abdominal
surgery. Participants were unrelated, between 18–60 years of age, from a multi-
ethnic background, and free from type 2 diabetes. Controls were matched to cases
by age, sex and ethnicity. All participants gave informed consent (Ethics committee
reference 13/LO/0477).
Adipose samples were processed immediately to isolate populations of primary
human adipocyte cells using established protocols27. Polypropylene plastic ware
was used to minimise adipocyte cell lysis. Adipose tissue samples were minced
into 1–2 mm3 pieces and washed in Hank’s buffered salt solution (HBSS), before
digestion using type 1 collagenase (1 mg ml−1, Worthington) in a water bath at
37 °C shaking at 100 r.p.m. for approximately 45 min. Digested samples were
filtered through a 300-μm nylon mesh to remove debris, and the filtered solution
centrifuged at low speed (500g; 5 min; 4 °C), to leave four layers, from top to
bottom: (1) oil, (2) mature adipocytes, (3) supernatant and (4) stromovascular
pellet. After removal of the oil layer, the mature adipocyte layer was collected by
pipette, washed in approximately 5× volume of HBSS and recentrifuged. After
3 washes the adipocyte cell suspension was collected for snap-freezing and storage
at −80 °C.
Genomic DNA and RNA were extracted from the isolated adipocytes using the
Qiagen AllPrep DNA/RNA/miRNA Universal Kit according to manufacturer’s
protocol for lipid-rich samples. Methylation of genomic DNA was quantified using
the Illumina HumanMethylation450 array in a single batch according to manu-
facturer’s specifications. Raw methylation data were preprocessed using R v.2.15.
Bead intensity was retrieved using the R package minfi v.1.6.0. Marker intensities
were quantile normalized for analysis. Principal component analysis of control
probe intensities was performed to quantify cryptic structure in the data arising
from technical factors. Logistic regression was used to examine the association of
each CpG site with morbid obesity compared to normal weight, adjusting for age,
sex and ethnicity, and the first five control probe principal components.
DNA methylation in liver tissue. Liver samples were obtained percutaneously
from patients undergoing liver biopsy for suspected NAFLD or intraoperatively for
assessment of liver histology. Normal control samples were recruited from samples
obtained for exclusion of liver malignancy during major oncological surgery. None
of the normal control individuals underwent pre-operative chemotherapy and
liver histology demonstrated absence of both cirrhosis and malignancy. Study
design, sampling method and data collection have been described in detail
elsewhere28. For methylation analysis, bisulfite conversion was performed using
the Zymo EZ DNA Methylation Kit (Zymo Research), followed by hybridization to the
Illumina HumanMethylation450 array (Illumina). mRNA expression analysis was
performed using the HuGene 1.1 ST gene array (Affymetrix) according to the
manufacturers’ protocols. Hybridization signals were analysed using GenomeStudio
software (default settings; GenomeStudio v.2011.1, Methylation Analysis Module
v.1.9.0; Illumina Inc) and internal controls for normalization.
Cross-tissue methylation. For extended cross-tissue correlation analyses, publicly
available data (GSE48472) were downloaded from the Gene Expression Omnibus
(GEO) database9. Briefly, the dataset consists of 41 samples of blood, liver,
muscle, pancreas, subcutaneous fat, omentum and spleen from six individuals, analysed on
the 450k methylation array. Data from the 187 CpG sites of interest were extracted
and plotted using the heatmap.2 function in the R package gplots (v.2.17.0). Mean
methylation levels for each CpG site across all samples within each tissue type were
used to test for pairwise correlation between tissue types.
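The cross-tissue comparison amounts to averaging each CpG within a tissue and correlating the resulting vectors between tissues. A minimal Python sketch of that idea follows; the CpG identifiers and β-values are made up, and the use of Pearson correlation here follows the Extended Data Figure 6 caption rather than any published code.

```python
import math
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tissue_means(samples, cpgs):
    """Average each CpG over all samples of the same tissue.

    samples: list of (tissue, {cpg_id: beta_value}) pairs.
    Returns {tissue: [mean beta per CpG, in the order of `cpgs`]}.
    """
    by_tissue = {}
    for tissue, betas in samples:
        by_tissue.setdefault(tissue, []).append(betas)
    return {
        tissue: [mean(s[cpg] for s in group) for cpg in cpgs]
        for tissue, group in by_tissue.items()
    }


def pairwise_tissue_correlation(samples, cpgs):
    """Pearson correlation of mean methylation for every pair of tissues."""
    means = tissue_means(samples, cpgs)
    return {
        (a, b): pearson(means[a], means[b])
        for a, b in combinations(sorted(means), 2)
    }


# Hypothetical toy data: two blood samples and one liver sample, three CpGs.
cpgs = ["cg_a", "cg_b", "cg_c"]
samples = [
    ("blood", {"cg_a": 0.1, "cg_b": 0.5, "cg_c": 0.9}),
    ("blood", {"cg_a": 0.3, "cg_b": 0.5, "cg_c": 0.7}),
    ("liver", {"cg_a": 0.2, "cg_b": 0.4, "cg_c": 0.6}),
]
print(pairwise_tissue_correlation(samples, cpgs))
```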
Genomic annotation analyses. To test for functional enrichment of the 187 CpG
sites associated with BMI, we used annotations of genomic context provided by
Illumina, and of histone modification ChIP peaks (H3K4me1, H3K4me3 and
H3K27Ac, marks of open chromatin) and DNaseI Hypersensitivity Sites in 127
different cell types in the Roadmap and ENCODE (Release 9, UCSC) datasets.
We mapped each probe on the Illumina 450k array background to the annotation
categories and recorded overlap at each probe as a binary variable. To determine
whether enrichment occurred more often than expected by chance, we generated
10,000 sets of 187 CpGs, each matched with the BMI sentinel CpGs for methylation
mean (± 2%) and standard deviation (± 0.2%), but otherwise selected at random.
For each epigenetic mark, we then calculated the number of overlapping sites
amongst the 187 replicating markers (observed) and 10,000 permuted sets of
187 markers (expected). We calculated the fold enrichment as observed/mean
(expected) and obtained an empirical P value from the distribution of expected.
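The permutation-based enrichment statistics can be sketched as follows in Python. Two points are assumptions of this sketch rather than statements from the text: the matching tolerances are read as absolute differences on the β-value scale, and the empirical P value uses the common (r + 1)/(n + 1) form. All function names and numbers are hypothetical.

```python
from statistics import mean


def is_matched(candidate, sentinel, mean_tol=0.02, sd_tol=0.002):
    """Check whether a candidate CpG matches a sentinel CpG.

    candidate, sentinel: (mean methylation, standard deviation) pairs.
    Tolerances are absolute on the beta scale (an assumption of this sketch).
    """
    return (
        abs(candidate[0] - sentinel[0]) <= mean_tol
        and abs(candidate[1] - sentinel[1]) <= sd_tol
    )


def enrichment(observed, permuted):
    """Fold enrichment and one-sided empirical P value.

    observed: number of sentinel CpGs overlapping the annotation.
    permuted: overlap counts for each matched random set.
    """
    fold = observed / mean(permuted)
    n_extreme = sum(1 for count in permuted if count >= observed)
    p_value = (n_extreme + 1) / (len(permuted) + 1)
    return fold, p_value


# Hypothetical overlap counts for four permuted sets.
fold, p_value = enrichment(30, [10, 20, 30, 20])
print(fold, p_value)
```

With 10,000 permuted sets, the smallest P value this form can return is 1/10,001.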
Gene expression studies. Transcriptome-wide measurements of gene expression
in blood along with measurements of DNA methylation from the same blood
sample were available for participants of both the KORA F4 (n = 703) and
LOLIPOP (n = 1,082, 907 Indian Asians, 175 Europeans) studies (Supplementary
Table 15). KORA samples were analysed with the Illumina HumanHT-12 v3
BeadChip array. Blood sample collection and RNA isolation and preparation
have been described in detail29,30. Gene expression data were quantile normalized
and log2 transformed using the R package lumi, v.2.8.0, from Bioconductor in
R, v.2.14.2. In LOLIPOP, gene expression analysis was performed with the
Illumina HumanHT-12 v4 BeadChip array according to manufacturer’s protocol.
Background correction (using negative controls), quantile normalization and log2
transformation were performed using the R package limma (function neqc).
To examine associations of DNA methylation with gene expression we carried
out linear regression with log2 transformed gene expression as the response
variable and methylation β values as independent variable. In KORA, the model
was adjusted for the discovery covariates and technical covariates related to the
expression measurement (RNA integrity number, RNA amplification plate, sample
storage time). In LOLIPOP, the model was adjusted for age, sex, methylation
control probe principal components and technical covariates related to the expres-
sion measurement (RNA integrity number, RNA extraction batch, RNA conversion
batch, scanning batch, array and array position). Results were analysed in KORA,
LOLIPOP Indian Asians and LOLIPOP Europeans separately, then combined by
inverse-variance meta-analysis using METAL v.2011-03-25. Statistical significance
was inferred at P < 9.0 × 10−6 (that is, P < 0.05 after Bonferroni correction for
5,551 CpG–expression pairs).
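The Bonferroni thresholds quoted in the Methods follow from dividing 0.05 by the number of tests. The short Python check below reproduces them from the test counts given in the text; the function name and dictionary labels are illustrative only.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Test counts as stated in the Methods.
test_counts = {
    "causal analysis (175 tests)": 175,
    "GRS SNP filtering (187 x 32 tests)": 187 * 32,
    "CpG-expression pairs (5,551 tests)": 5551,
    "metabolic traits (187 x 13 tests)": 187 * 13,
}

for label, n_tests in test_counts.items():
    print(f"{label}: P < {bonferroni_threshold(n_tests):.1e}")
```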
To assess whether the 187 sentinel CpGs were enriched for association with
gene expression, we used the same testing concept as described above based on
constructing a null distribution from 10,000 randomly selected, matched sets of
187 CpGs. For each permuted set we determined the number of significantly associated
expression probes in cis (P < 9.0 × 10−6) as described above and compared the
resulting distribution with the observed number of gene expression associations
for the 187 sentinel CpG sites to calculate an empirical P value.
Finally, we examined the association between DNA methylation and gene
expression in TwinsUK adipose tissue samples (n = 499) for the 44 methylation–expression
pairs that were significant in blood. Expression values were adjusted for
age and chip using a linear model. The association of methylation and expression
was then determined in linear mixed-effects models with adjusted expression as
response and methylation as the independent variable, adjusting for age, chip,
bisulfite conversion level and bisulfite conversion efficiency, with zygosity and
family as random effects. After QC filtering of methylation and expression data,
results were available for 36 methylation–expression pairs.
Candidate genes and gene-set-enrichment analyses. The standard Illumina
annotation does not identify genes for all CpG sites on the 450K microarray.
We therefore identified candidate genes based on the following criteria: (1) proximity,
gene nearest to the CpG site (n = 187 genes), and (2) gene expression, all local genes
(up to ± 500 kb) with expression associated with the marker at P < 0.05 after
Bonferroni correction for 5,551 tests (n = 38 genes). This resulted in a list of
210 unique genes (Supplementary Table 19).
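The proximity part of the candidate-gene selection can be illustrated with a small Python sketch. It assumes that distance is measured to the gene body (zero if the CpG lies inside the gene), which the text does not specify; gene names and coordinates are hypothetical, and the expression-association filter of criterion (2) is not modelled.

```python
def distance_to_gene(pos, start, end):
    """Distance in bp from a position to a gene body (0 if inside)."""
    if start <= pos <= end:
        return 0
    return min(abs(pos - start), abs(pos - end))


def nearest_gene(cpg, genes):
    """Name of the gene nearest to the CpG on the same chromosome, or None.

    cpg: (chromosome, position); genes: list of (name, chromosome, start, end).
    """
    chrom, pos = cpg
    same_chrom = [g for g in genes if g[1] == chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda g: distance_to_gene(pos, g[2], g[3]))[0]


def genes_within(cpg, genes, window=500_000):
    """Names of all genes within +/- `window` bp of the CpG."""
    chrom, pos = cpg
    return [
        g[0]
        for g in genes
        if g[1] == chrom and distance_to_gene(pos, g[2], g[3]) <= window
    ]


# Hypothetical annotation.
genes = [
    ("GENE_A", "chr1", 1_000, 5_000),
    ("GENE_B", "chr1", 400_000, 450_000),
    ("GENE_C", "chr1", 2_000_000, 2_100_000),
    ("GENE_D", "chr2", 1_000, 2_000),
]
cpg = ("chr1", 6_000)
print(nearest_gene(cpg, genes), genes_within(cpg, genes))
```

In the analysis described here, the genes returned by the window search would be kept only if their expression was associated with the marker.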
Gene annotations were downloaded from Ensembl (https://grch37.ensembl.
org/index.html) using the R package biomaRt v.2.18.0 from Bioconductor, and
overlapped with the cg positions as annotated in the Illumina annotation using the
R package GenomicRanges v.1.14.4 from Bioconductor. We downloaded curated
pathway information (c2.all.v5.0.symbols.gmt) from the GSEA MSigDB platform
(http://www.broadinstitute.org/gsea/msigdb), resulting in 1,135 pathways,
to investigate enrichment of the set of candidate genes against curated pathway
sets (BIOCARTA, KEGG, REACTOME). An enrichment P value was calculated
empirically on the basis of permutation testing, and corrected for multiple testing
using the Benjamini–Hochberg (false-discovery rate) procedure. For the sensitivity
analysis, the gene-set-enrichment analysis was repeated using the genes annotated
by Illumina, and using more permissive proximity criteria (Supplementary Table 29).
Results became less statistically significant when candidate gene selection based on
proximity alone was extended to include all genes over distances up to 500 kb.
DNA methylation and metabolic traits. We investigated the association between
the 187 sentinel methylation markers and metabolic disturbances associated with
adiposity amongst participants of the KORA (n = 1,697) and LOLIPOP (n = 2,462)
studies with available measurements of the following BMI-related clinical traits:
LDL (low-density lipoprotein) cholesterol, HDL cholesterol, total cholesterol,
fasting triglycerides, fasting glucose, fasting insulin, HbA1c, systolic and diastolic
blood pressure, C-reactive protein, weight, height and waist–hip ratio. Linear
models were used with trait as response and methylation as independent variable,
adjusting for the discovery covariates. Results from KORA and LOLIPOP studies
were analysed separately, then combined by inverse variance meta-analysis using
METAL v.2011-03-25. Associations were considered significant at P < 2.1 × 10−5
(corresponding to P < 0.05 after Bonferroni correction for 187 × 13 tests).
To investigate potential causal relationships between the methylation markers
and BMI-related clinical traits, we performed causality analyses as described above
for the primary phenotype (BMI). For each clinical trait, GWAS datasets of the
most comprehensive meta-analyses published to date with access to genome-wide
association results were retrieved (Supplementary Table 30), to provide SNPs influencing
each trait. SNPs associated with multiple traits were assigned to the most strongly
associated trait (lowest P value). Clinical traits were transformed as described in the
respective GWAS. Genetic risk scores were calculated as described above for BMI,
after removal of SNPs with direct genomic effects (SNPs that remain associated
with the sentinel CpG after adjustment for the trait). Regression analyses were
carried out in the KORA F4 and LOLIPOP cohorts separately and results were
combined by inverse variance meta-analysis using METAL v.2011-03-25.
Association with incident type 2 diabetes. We tested the association of DNA
methylation at the 187 identified CpG sites with incident type 2 diabetes amongst
participants of the LOLIPOP study. All participants (n = 2,664) were free from
type 2 diabetes at the time of measurement of DNA methylation; incident
type 2 diabetes (n = 1,074) was defined as either new physician diagnosis, or
HbA1c ≥ 6.5%. Associations with type 2 diabetes were evaluated by logistic
regression adjusted for the discovery covariates. We initially tested the association
in single-marker tests, then in a fully saturated model comprising all 187 markers
to identify independent effects.
To combine information across loci, we calculated a weighted methylation risk
score as the sum of the standardised methylation values at each marker that reached
nominal significance (P < 0.05) in the fully saturated multivariate model, weighted
by marker-specific effect size. We then tested the association of the methylation
risk score with incident type 2 diabetes using logistic regression, before and after
adjustment for traditional type 2 diabetes risk factors (BMI, waist–hip ratio,
glucose and HbA1c).
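The construction of the methylation risk score can be sketched in Python as a weighted sum of standardised methylation values. This is an illustration under assumptions: standardisation uses the sample mean and sample standard deviation across participants, and the marker identifier, methylation values and weight are hypothetical. Selection of markers at P < 0.05 in the saturated model is taken as already done.

```python
from statistics import mean, stdev


def standardise(values):
    """Convert values to z scores using the sample mean and sample s.d."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def methylation_risk_score(methylation, weights):
    """Weighted sum of standardised methylation values per participant.

    methylation: {marker_id: [value for each participant]}.
    weights: {marker_id: effect size} for the selected markers only.
    Returns one score per participant, in input order.
    """
    n_participants = len(next(iter(methylation.values())))
    scores = [0.0] * n_participants
    for marker, weight in weights.items():
        for i, z in enumerate(standardise(methylation[marker])):
            scores[i] += weight * z
    return scores


# Hypothetical example: one selected marker, three participants.
methylation = {"cg_marker_1": [0.1, 0.2, 0.3]}
weights = {"cg_marker_1": 2.0}
print(methylation_risk_score(methylation, weights))
```

The resulting score would then enter a logistic regression as a single predictor, which is what allows a relative risk per 1 standard deviation of the score to be reported.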
Replication testing of the association of methylation risk score with type 2
diabetes was carried out in a nested case–control study within the KORA S3/S4
comprising 200 subjects with newly diagnosed type 2 diabetes and 200 controls
matched for age (± 2 years), sex, cohort and observation time until diagnosis
of diabetes. Data were analysed using conditional logistic regression using the
function clogit of the R package survival v.2.37.4.
Software. Unless stated otherwise, all calculations were performed using R v.3.0.1.
For all meta-analyses, METAL v.2011-03-25 was used. Custom R code for the
respective analyses is available at: http://metabolomics.helmholtz-muenchen.de/
bmi_methylation/.
Data availability. Summary statistics from the epigenome-wide association
study can be accessed from the European Genome–phenome Archive
(EGAS00001001922; https://www.ebi.ac.uk/ega/studies/EGAS00001001922).
KORA methylation data are available upon request through the application tool
KORA.PASST (http://epi.helmholtz-muenchen.de); LOLIPOP data are available
from the Gene Expression Omnibus (GSE55763); EPICOR data are deposited in
the HuGeF repository (http://www.hugef-torino.org) and are available on request.
22. Houseman, E. A. et al. DNA methylation arrays as surrogate measures of cell
mixture distribution. BMC Bioinformatics 13, 86 (2012).
23. Lehne, B. et al. A coherent approach for analysis of the Illumina
HumanMethylation450 BeadChip improves data quality and performance in
epigenome-wide association studies. Genome Biol. 16, 37 (2015).
24. Lyons, A. B. & Parish, C. R. Determination of lymphocyte division by flow
cytometry. J. Immunol. Methods 171, 131–137 (1994).
25. Park, D. et al. Noninvasive imaging of cell death using an Hsp90 ligand. J. Am.
Chem. Soc. 133, 2832–2835 (2011).
26. Burgess, S. Sample size and power calculations in Mendelian randomization
with a single instrumental variable and a binary outcome. Int. J. Epidemiol. 43,
922–929 (2014).
27. Spalding, K. L. et al. Dynamics of fat cell turnover in humans. Nature 453,
783–787 (2008).
28. Ahrens, M. et al. DNA methylation analysis in nonalcoholic fatty liver disease
suggests distinct disease-specific and remodeling signatures after bariatric
surgery. Cell Metab. 18, 296–302 (2013).
29. Schurmann, C. et al. Analyzing Illumina gene expression microarray data from
different tissues: methodological aspects of data analysis in the MetaXpress
consortium. PLoS One 7, e50938 (2012).
30. Döring, A. et al. SLC2A9 influences uric acid concentrations with pronounced
sex-specific effects. Nat. Genet. 40, 430–436 (2008).
© 2017 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
Extended Data Figure 1 | Study design. Epigenome-wide association and
replication testing was performed in order to identify methylation sites
associated with adiposity. In the discovery step, four large cohorts were
included with Illumina 450k DNA methylation data available, which were
preprocessed and quality controlled according to a harmonized protocol.
Epigenome-wide association was performed in every single study with
BMI as response variable and methylation β-value as independent variable,
adjusting for covariates as described in the Methods. At a genome-wide
significance level of P < 1 × 10−7, 278 methylation sites from 207 regions
were identified. In the replication step, 187 of these were replicated in
independent samples. Genetic association and causality analyses were
used in order to investigate whether the identified methylation signals
underlie the development of adiposity or are the consequence of adiposity.
The findings were supported by longitudinal analyses. The cross-tissue
analyses represent a first step towards extending our observations in
blood to metabolically relevant tissues. The functional genomics and gene
expression analyses help to link the observed methylation associations to
transcriptional outcomes, while the gene-set enrichment analysis provides
a way to summarize the potentially affected metabolic pathways. Finally,
we studied the relationships between methylation and adiposity-related
metabolic traits and type 2 diabetes to address the clinical relevance of our
findings.
Extended Data Figure 2 | Distribution of methylation values at the
187 sentinel CpG sites compared to the approximately 473,000 CpG
sites assayed by the Illumina Infinium 450K Human Methylation array.
The 187 identified methylation–BMI associations are strongly enriched
for CpG sites with intermediate levels of methylation, consistent with the
presence of epigenetic heterogeneity at these loci in blood (157 out of
187 sites with 20–80% methylation, a 3.0-fold enrichment compared to
microarray background, P = 1.4 × 10−22 Fisher’s test).
Extended Data Figure 3 | DNA methylation at the sentinel CpG sites
in whole blood and in 4 isolated cell subsets (monocytes, neutrophils,
CD4+ and CD8+ T cells) from 60 individuals (30 obese individuals
and 30 normal weight controls) by Illumina MethylationEPIC array,
which quantifies 179 of the 187 sentinel markers. Results are shown
as a heatmap, coded by methylation value (hypomethylation < 0.2;
intermediate methylation 0.2–0.8; hypermethylation > 0.8). Results
show the presence of intermediate methylation (and hence epigenetic
heterogeneity) at the majority of loci, and in the majority of cell types, in
both cases and controls.
Extended Data Figure 4 | Association of DNA methylation with obesity in the 4 cell subsets studied, based on quantification of methylation
at 179 sentinel methylation markers in 30 obese individuals and 30 normal weight controls. Results are presented as QQ plots of the observed
association test statistics in each of the isolated cell subsets. λ, the genomic control inflation factor.
Extended Data Figure 5 | Comparison of effect sizes between isolated white cell subsets. Results are presented as the difference in methylation
between obese cases and normal weight controls (methylation in cases − methylation in controls, in absolute terms on percentage scale) in the respective
isolated white-blood-cell subset (y axis) compared to the average case−control difference across all 4 cell subsets studied (x axis).
Extended Data Figure 6 | Mean methylation levels at the 187 sentinel methylation markers associated with BMI, across 7 tissue types. Bottom,
pairwise scatterplots (trend line in red). Top, the Pearson correlation coefficient and P values. Blood, n = 6; liver, n = 5; muscle, n = 6; omentum, n = 6;
pancreas, n = 4; subcutaneous (SC) fat, n = 6; spleen, n = 3.
Extended Data Figure 7 | Causality analysis in adipose tissue to
investigate the potential relationships between BMI and DNA
methylation. Left, causality analysis in adipose tissue investigating
whether DNA methylation at sentinel CpG sites influences BMI. Units
are change in BMI per copy of effect allele. For each sentinel CpG site, we
determined the effect of a previously identified cis SNP on BMI predicted
through methylation (x axis) and the directly observed effect of a SNP
on BMI (y axis). No CpG passed multiple testing corrections for all three
comparisons. Overall there was little relationship between the effects of
SNPs on BMI predicted through methylation and the directly observed
effect (R = −0.04, P = 0.58). Right, causality analysis in adipose tissue
investigating whether DNA methylation at sentinel CpG sites is the
consequence of BMI. Units are change in methylation per unit change in
weighted genetic risk score. We identified SNPs reported to influence BMI
in GWAS meta-analysis, and calculated a weighted genetic risk score. For
each sentinel CpG site we then determined the effect of genetic risk score
on methylation predicted via BMI (x axis) and the directly observed effect
of genetic risk score on methylation (y axis). No CpG passed multiple
testing corrections for all three comparisons. The overall correlation
between observed and predicted effects (R = 0.73, P = 1.6 × 10−32)
replicates our findings in blood that methylation at the majority of CpG
sites is consequential to BMI.
Extended Data Figure 8 | The 187 sentinel CpGs are enriched for
association with gene-expression in cis in blood. a–d, To derive an
expectation under the null-hypothesis we generated 10,000 sets of
matched CpGs (matched for mean methylation and for s.d. of methylation,
see Methods), and tested their association with expression of the nearest
gene (a), the gene allocated to the CpG by the Illumina annotation
(b), all genes within a 500 kb distance (c) and all genes within a 500 kb
distance excluding the nearest gene (d). We observed significantly more
expression-probes associated with the sentinel markers (red arrow) in
blood compared to the 10,000 permuted sets (green bars).
Extended Data Figure 9 | Summary statistics for the causality analyses
investigating the relationship between DNA methylation in blood and
metabolic disturbances. a, DNA methylation in blood as a potential
determinant of the metabolic disturbances associated with adiposity
(causal analysis). For each of the sentinel CpG sites, we identified the cis
SNP (1 Mb) most closely associated with DNA methylation levels. For each
of the SNPs, we then determined the effect of SNP on phenotype predicted
via methylation and the directly observed effect of SNP on phenotype.
Results are presented as the R2 between phenotype-specific observed
and predicted effects across the 187 CpG sites, calculated using linear
regression. b, DNA methylation in blood as a potential consequence of the
metabolic disturbances associated with adiposity (consequential analysis).
We identified the SNPs reported to influence each phenotypic trait (using
the most recent GWAS meta-analysis; Supplementary Table 24), and
calculated phenotype-specific weighted genetic risk scores. For each of the
CpG sites, and each of the phenotypes, we then determined the effect of
genetic risk score on methylation predicted through phenotype, and the
directly observed effect of genetic risk score on methylation. Results are
presented as the R2 between phenotype-specific observed and predicted
effects across the 187 CpG sites, calculated using linear regression.
P values are shown for correlations between observed and predicted
effects that reach P < 0.05.
Extended Data Figure 10 | Association of established and emergent
biomarkers with type 2 diabetes. Results are presented as risk of type 2
diabetes associated with the specified biomarkers in three models: model 1,
adjusted for age and sex; model 2, as for model 1, but additionally
adjusted for body mass index and impaired fasting glucose; and model 3,
as for model 2, but additionally adjusted for central obesity and insulin
concentrations. CRP, C-reactive protein; MRS, methylation risk score.
Results for quantitative traits (amino acids, C-reactive protein, insulin
and methylation risk score) are presented as risk of type 2 diabetes in Q4
compared to Q1.