Widespread plasticity in CTCF occupancy linked
to DNA methylation
Hao Wang,1,5 Matthew T. Maurano,1,5 Hongzhu Qu,1,2,5 Katherine E. Varley,3
Jason Gertz,3 Florencia Pauli,3 Kristen Lee,1 Theresa Canfield,1 Molly Weaver,1
Richard Sandstrom,1 Robert E. Thurman,1 Rajinder Kaul,1 Richard M. Myers,3
and John A. Stamatoyannopoulos1,4,6
1Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; 2Laboratory of Disease Genomics
and Individualized Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China; 3HudsonAlpha
Institute for Biotechnology, Huntsville, Alabama 35806, USA; 4Department of Medicine, University of Washington, Seattle,
Washington 98195, USA
CTCF is a ubiquitously expressed regulator of fundamental genomic processes including transcription, intra- and
interchromosomal interactions, and chromatin structure. Because of its critical role in genome function, CTCF binding
patterns have long been assumed to be largely invariant across different cellular environments. Here we analyze
genome-wide occupancy patterns of CTCF by ChIP-seq in 19 diverse human cell types, including normal primary cells and
immortal lines. We observed highly reproducible yet surprisingly plastic genomic binding landscapes, indicative of
strong cell-selective regulation of CTCF occupancy. Comparison with massively parallel bisulfite sequencing data
indicates that 41% of variable CTCF binding is linked to differential DNA methylation, concentrated at two critical
positions within the CTCF recognition sequence. Unexpectedly, CTCF binding patterns were markedly different in
normal versus immortal cells, with the latter showing widespread disruption of CTCF binding associated with increased
methylation. Strikingly, this disruption is accompanied by up-regulation of CTCF expression, with the result that both
normal and immortal cells maintain the same average number of CTCF occupancy sites genome-wide. These results
reveal a tight linkage between DNA methylation and the global occupancy patterns of a major sequence-specific
[Supplemental material is available for this article.]
The polyfunctional regulator CTCF plays a central role in multiple
complex genomic processes, including transcription (Baniahmad
et al. 1990; Filippova et al. 1996; Vostrov and Quitschke 1997),
imprinting (Bell and Felsenfeld 2000; Hark et al. 2000), and long-
range chromatin interactions and subnuclear localization (Yusufzai
et al. 2004; Splinter 2006; Hou et al. 2008). Cohesin, a major medi-
ator of chromosomal contacts during mitosis (Seitan et al. 2011), is
chromosome pairing (Parelho et al. 2008; Rubio et al. 2008; Wendt
et al. 2008). CTCF has also been connected with multiple
malignancies, including by the association of mutations in its gene
locus (Filippova et al. 1998), through its anti-proliferative effect
(Rasko et al. 2001), and through regulatory interactions with tumor
suppressor genes (Butcher et al. 2004; Witcher and Emerson 2009;
Soto-Reyes and Recillas-Targa 2010; Dávalos-Salas et al. 2011).
CTCF is ubiquitously expressed, and it is widely believed that
CTCF binding patterns are largely invariant between cell types
(Kim et al. 2007; Cuddapah et al. 2008; Heintzman et al. 2009),
though diverse regulatory mechanisms at individual loci have
been described (Lefevre et al. 2008; Sekimata et al. 2009; Witcher
and Emerson 2009; Lai et al. 2010; Shukla et al. 2011). In addition,
at a small number of loci, variable CTCF occupancy has been linked
and in vitro studies suggest that methylation may hinder CTCF
the degree to which CTCF binding patterns vary between different
cell types nor the relationship of such variability with DNA meth-
ylation is currently known.
We therefore sought to establish the cellular selectivity of
CTCF binding and to define its relationship with methylation on a
global scale. By using genome-wide occupancy profiling and re-
duced representation bisulfite sequencing (RRBS), we establish
that a majority of CTCF sites are cell-selective, and link 41% of
further observe markedly different CTCF binding patterns dis-
tinguishing normal and immortal cells, which are associated with
results indicate a global linkage between DNA methylation and the
occupancy patterns of an important genome regulator.
5These authors contributed equally to this work.
Article and supplemental material are at http://www.genome.org/cgi/doi/10.1101/gr.136101.111. Freely available online through the Genome Research
Open Access option.
Genome Research 22:1680–1688 © 2012, Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/12; www.genome.org
Widespread plasticity of CTCF occupancy patterns
To assess CTCF binding variation genome-wide, we localized and
quantified CTCF occupancy by ChIP-seq in 19 diverse cell types,
including seven immortal cell lines and 12 normal cell types. We
generated two biological replicates for each cell type. Both replicates
were of high enrichment and exhibited high concordance
(average correlation of 0.93) (Supplemental Fig. S1). We found that
(Supplemental Fig. S1A). In total, we identified 77,811 distinct
binding sites across all 19 cell types.
assessed how many cell types demonstrated binding at each site
(see Methods). In all 19 cell types, 27,662 binding sites were pres-
least one cell type (Supplemental Table S2). Thus, 64% of CTCF
sites are found to vary in at least one cell type, demonstrating the
existence of a widespread variability in CTCF occupancy. These
variable sites exhibited clear occupancy differences between
bound and unbound cell types, including at the well-known H19/
IGF2 imprinted locus (Fig. 1A–C). Variable binding sites were oc-
shared regulation between cell types (Fig. 1D). Indeed, between
any two cell types, an average of 72% of bound sites were in
common (Supplemental Fig. S3). Variable sites had a similar ge-
nomic localization (Fig. 1E) compared with constitutive sites.
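The constitutive/variable split described above can be illustrated with a minimal Python sketch. The function name, the toy site labels, and the presence/absence table are hypothetical and serve only as illustration; the three totals at the bottom are the counts reported in the text.

```python
# Minimal sketch: classify binding sites from per-cell-type presence calls.
# `classify_sites` and the toy table below are hypothetical illustrations.

def classify_sites(presence):
    """Split sites into constitutive (bound in every cell type)
    and variable (bound in at least one but not all cell types)."""
    constitutive, variable = [], []
    for site, calls in presence.items():
        n_bound = sum(calls)
        if n_bound == len(calls):
            constitutive.append(site)
        elif n_bound > 0:
            variable.append(site)
    return constitutive, variable

# Toy example with three cell types (1 = bound, 0 = unbound).
toy = {"siteA": [1, 1, 1], "siteB": [1, 0, 1], "siteC": [0, 0, 1]}
constitutive, variable = classify_sites(toy)

# Counts reported in the text.
total_sites = 77811
constitutive_sites = 27662
variable_sites = total_sites - constitutive_sites
variable_fraction = variable_sites / total_sites
print(variable_sites, round(100 * variable_fraction))
```

With the reported totals, the variable set is the difference between all distinct sites and the constitutive sites, which is where the 64% figure comes from.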
Distinct CTCF binding landscapes in normal vs. immortal cells
To understand whether binding variability follows a similar pat-
tern in related cell types, we performed an unsupervised hierar-
chical clustering of variable CTCF binding sites (see Methods). We
found that the variable CTCF binding landscape distinguished
three groups (Fig. 2A). The first group of immortal cells consists of
malignancy-derived and EBV-immortalized cell lines, including
several carcinomas (colorectal, Caco-2; cervical, HeLa-S3; hepato-
(WERI-RB-1) and EBV-transformed lymphoblastoid (GM06990).
The remaining two groups consist of normal cell types of limited
proliferative potential: The second group consists of three epithe-
lial cell types, including renal cortical (HRE), small airway (SAEC),
and esophageal (HEEpiC) mucosal epithelia, and the third group
consists of fibroblasts, including abdominal (AG10803), toe skin
mammary (HMF), pulmonary artery (HPAF), and pulmonary (HPF)
and brain microvascular endothelium (HBMEC). Principal compo-
nent analysis and bootstrap assessment of the uncertainty in the
hierarchical clustering confirmed a separation between the normal
cell types and remaining cell lines, although the epithelial line HRE
Figure 1. CTCF in vivo binding exhibits widespread plasticity. (A–C) Constitutive and variable CTCF sites. (A) The H19/IGF2 imprinted locus in multiple
human cell types. Note the total silencing in two cell lines of the seven CTCF sites in the differentially methylated region (DMR; yellow box at left), and the
complex pattern of cell-selective CTCF binding flanked by constitutive sites. Location (hg19), chr11:2,015,000–2,184,000. (B,C) Additional examples of
variable sites. (D) Genome-wide analysis of CTCF binding in 19 cell types reveals 77,811 distinct binding sites; 27,662 sites are constitutively present in all
cell types; 50,149 variable sites exhibiting a wide range of selectivity are present in a subset of one to 18 cell types (below). (E) Genomic distribution of
variable sites is similar to constitutive sites (Supplemental Fig. S2A).
Cell-selective CTCF occupancy landscapes
to identify the specific binding differences characterizing these
three groups. We identified 4146 specific binding sites whose oc-
cupancy was significantly different between these groups at a
false-discovery rate (FDR) of 1% (Methods) (Fig. 2B; Supplemental
regulatory differences distinguishing immortal cell lines from nor-
mal epithelium, endothelium, and fibroblasts.
Variable CTCF occupancy linked to CpG methylation
Pre-existing methylation can antagonize CTCF binding in vitro
(Bell and Felsenfeld 2000; Hark et al. 2000; Kanduri et al. 2000).
Therefore we asked whether differential methylation was associ-
ated with variable sites in vivo. To study this, we compared CTCF
occupancy and RRBS data (Fig. 3A). We studied a subset of CTCF
sites in 13 cell types (n = 6,707) for which RRBS data were
available from the ENCODE project (KE Varley, J Gertz, KM
Bowling, SL Parker, TE Reddy, F Pauli, MK Cross, BA Williams, JA
Stamatoyannopoulos, GE Crawford, et al., in prep.). We obtained
methylation status of 44,048 CpG dinucleotides in the region centered
on these sites (see Methods), with each CpG monitored in an
average of 12 out of 13 cell types (Supplemental Fig. S6).
First, we assessed the overall methylation status at the 6707
CTCF sites with RRBS data. We found that methylation was sub-
stantially more variable at variable CTCF sites than at constitutive
intermediate methylation status (between 25% and 75% methylation)
(Supplemental Fig. S6). Overall, 98% of CTCF sites were unmethylated
(defined as <50% methylation) in at least one of the cell
types tested, confirming an inverse relationship between methylation
and CTCF occupancy. However, 47% of CTCF sites were
methylated (>50% methylation) in at least one cell type, suggesting
a widespread potential link between methylation and CTCF
To quantify the global association of differential methylation
status with variable CTCF occupancy, we performed a linear re-
gression analysis at the 6707 sites for which we had RRBS data
(Fig. 3B; see Methods). Four thousand ninety-nine (61%) of these
the 4099 variable sites with RRBS data, 1677 (41%) showed a sig-
nificant association (5% FDR) between methylation and occu-
pancy (Fig. 3C). At significant sites, increased methylation was
negatively associated with occupancy in 98% of cases. The mag-
strong: Occupancy was on average 87% lower at significant sites in
the methylated cell types relative to the unmethylated cell types
(Fig. 3D). Further supporting a strong link to methylation, 67% of
variable methylation was associated with a concomitant effect on
occupancy. The remaining 33% of sites with variable methylation
that was not associated with occupancy nevertheless demon-
strated an aggregate reduction in occupancy in methylated cell
types (Supplemental Fig. S7), confirming the overall inverse asso-
ciation of methylation with CTCF occupancy but suggesting that
this relationship may be complicated by additional factors at this
subset of sites.
Figure 2. CTCF occupancy distinguishes similar cell types. (A) Unsupervised hierarchical clustering of binding at all CTCF sites. (B) CTCF occupancy at
4146 variable binding sites that distinguish immortal cell lines, epithelia, fibroblasts and endothelia (Methods). x-axis, CTCF binding sites in chromosomal
order, separated into sites that are up-regulated and down-regulated (arrows) in each of the three groups (immortal, epithelial, fibroblast, and endothelial).
Color corresponds to Z-score of normalized ChIP-seq density.
Wang et al.
Next we asked if the inverse relationship between methylation
and CTCF occupancy is characterized by regional hypermethylation
or if instead methylation is concentrated specifically
at the region of protein–DNA interaction. We examined the loca-
tion of all CpG dinucleotides relative to the CTCF motif at sites
associated with occupancy differences showed an enrichment of
CpG dinucleotides at two positions in the CTCF recognition se-
quence (Fig. 4). This finding is consistent with previous reports
showing methylation outside the recognition sequence does not
affect CTCF binding in vitro (Engel et al. 2004; Chadwick 2008).
Within the recognition sequence, methylation at one of these
CpGs (position 1) has been shown to inhibit binding of CTCF in
vitro (Renda et al. 2007). The second (position 11) is the pre-
dominant CpG in the motif, which has been shown to have
a higher rate of C–T transitions at vertebrate-conserved binding
sites (Kim et al. 2007), consistent with germline methylation. In-
terestingly, constitutively unmethylated CTCF sites also showed
an enrichment of CpGs at these two positions compared with
differentially methylated sites without an association to occupancy
(Supplemental Fig. S8). Given that the latter sites nevertheless
exhibit substantial methylation variability, this suggests that the
absence of CpGs at these positions may decouple CTCF occupancy
from differential methylation at these sites. Overall, 29% of CTCF
recognition sequences genome-wide contain a CpG at positions 1
and/or 11, and 52% of recognition sequences contain a CpG anywhere
in the sequence. The genome-wide prevalence of “susceptible”
CTCF sites suggests a widespread potential for interaction
between CTCF and methylation.
Figure 3. Impact of DNA methylation on cell-selective CTCF binding. (A) Example CTCF binding sites, where occupancy (above) quantitatively increases
as local CpG methylation decreases (below). Green indicates CpG is 0% methylated; yellow, 50%; and red, 100%. (B) Quantitative analysis of
cell-type-selective patterns of methylation also exhibited differences in occupancy. (D) At methylated binding sites, occupancy was reduced on average by
87% compared with cell lines without methylation at the same site. Shown are sites where increased methylation was associated with decreased occupancy
(98% of all significant sites).
Methylation-associated remodeling of CTCF binding
in immortal cell lines
Paralleling prior reports of widespread hypermethylation in cancer
(Jones and Baylin 2007; KE Varley, J Gertz, KM Bowling, SL Parker,
GE Crawford, et al., in prep.), we observed a bimodal pattern of
methylation at CTCF sites distinguishing normal and immortal
cell types (Fig. 5A). At 31% of the sites where differential methyl-
ation was associated with CTCF occupancy, methylation was ob-
served throughout the 13 normal and immortal cell types (average
number of methylated cell types, 7.3). In contrast, the remaining
69% of sites were characterized by cell-specific hypermethylation
constrained to the six immortal lines (average number of meth-
ylated cell lines, 2.1) (Fig. 5A, strip at right). Notably, although the
neuroblastoma line SK-N-SH_RA clusters with epithelial cell types
based purely on CTCF binding (Fig. 2A), it exhibits the hyper-
methylation characteristic of the other immortal lines. Surpris-
ingly, the increased methylation in immortal lines does not cor-
5B). Strikingly, we also observed that CTCF transcript levels are
significantly higher in the immortal cell lines (Fig. 5C). This dis-
ruption of CTCF binding in immortal cell lines is further distin-
guished by a unique association between CTCF occupancy and
methylation at promoter sites. Of the promoter CTCF sites where
methylation was significantly associated with occupancy, 98%
(281 of 288) of these sites were characterized by hyper-
methylation in the immortal lines (Fig. 5D). These results suggest
a widespread methylation-associated remodeling of the CTCF
binding landscape in immortal cell lines.
Surprising plasticity of the CTCF occupancy landscape
This study exposes a previously unappreciated degree of plasticity
within the binding landscape of the master genomic regulator
CTCF. Previous studies in a small number of cell types had un-
et al. 2008; Heintzman et al. 2009). We further associate differential
methylation with 41% of this variable binding at a subset of sites
overlapping existing RRBS data in 13 cell types. We specifically
linked this variable methylation to the presence of a CpG at two key
positions relative to the consensus motif. Finally, we observe the
maintenance of a stable total amount of CTCF genomic binding
sites in immortal cell lines despite their altered localization as-
that methylation is indeed a global feature of the regulatory di-
versity of CTCF, and our approach is readily extensible to the rep-
ertoire of vertebrate transcription factors.
Methylation-associated disruption of CTCF binding
in immortal lines
Although CTCF binding varied across all 13 cell types, we observed
unique patterns of CTCF occupancy specific to the immortal cell
lines. Interestingly, CTCF overexpression has previously been as-
sociated with resistance to apoptosis in breast cancer cell lines
(Docquier et al. 2005) and with DNMT3B overexpression (Butcher
et al. 2004). Further, the unique occurrence of hypermethylation-associated
abrogation of CTCF occupancy at promoters in immortal
lines is notable, given the involvement of CTCF in the methylation-associated
silencing of known tumor suppressors and oncogenes
Targa 2010). We found that the immortal cell lines we profiled have
the same overall amount of genomic CTCF binding sites despite
a redistribution of CTCF occupancy from binding sites subject to
hypermethylation. The concomitant up-regulation of CTCF ex-
pression may therefore represent a cancer-associated compensa-
tory mechanism. This inverse correlation is compatible with the
existence of a stabilizing mechanism acting through increased
CTCF expression to maintain a constant level of genomic binding
despite increased methylation at its target sites, although further
study in an expanded set of cell types will be necessary.
The role of DNA methylation in regulation of transcription
Although DNA methylation is widely invoked as a causal mecha-
nism for transcriptional repression, surprisingly little in vivo evi-
dence is available. While experimentally directed methylation can
prevent binding of CTCF and other factors in vitro (Tate and Bird
1993; Renda et al. 2007), the mechanisms establishing methyla-
tion patterns in vivo remain unknown, and its precise relationship
with gene expression remains unclear (Enver et al. 1988; Selker
1990; Walsh and Bestor 1999). Likewise, our results do not dis-
tinguish whether demethylation facilitates subsequent CTCF bind-
ing or whether bound CTCF maintains an unmethylated domain.
An alternative model has DNA methylation deposited passively
in the wake of independent abrogation of transcription factor bind-
factor binding sites appear to be generally depleted for DNA meth-
ylation(Mukhopadhyay etal. 2004; Lister et al.2009;Thurman etal.
2012) and that binding sites recognized by certain sequence-specific
factors have been associated with lack of methylation (Straussman
et al. 2009; Dickson et al. 2010; Gebhard et al. 2010; Lienert et al.
2011). Indeed, there is evidence that the binding of some transcrip-
tion factors, including CTCF, is sufficient to effect a local demethyl-
ated state (Matsuo et al. 1998; Lin et al. 2000; Stadler et al. 2011). But
if in vivo methylation was deposited generally at unoccupied bind-
ing sites, then how would this process interact with the in vitro
methylation sensitivity of common transcription factors?
The well-investigated H19/Igf2 imprinted locus offers an ap-
propriate example: CTCF binding there has been shown necessary
to maintain an existing unmethylated state (Schoenherr et al.
2002; Pant et al. 2004). However, CTCF is not the originator of
the unmethylated state (Matsuzaki et al. 2010), implying a limited
Figure 4. Sites significantly affected by methylation are enriched for
CpGs at two positions. Frequency of a CpG (y-axis) at positions relative to
the CTCF motif (x-axis) is shown for sites with variable methylation that is
associated (red) and is not associated (gray) with occupancy changes.
Note that at positions 1 and 11, there is a 2.2- and 1.8-fold enrichment,
respectively, for the presence of a CpG at sites where the variable methylation
was associated with occupancy. Twenty-nine percent of CTCF
motifs genome-wide contain a CpG at one or both of these positions.
acts as a cooperative switch to prevent the return of CTCF after
a reprogramming event. In this model, rather than guiding bind-
ing localization, methylation is a general amplifier of perturba-
tions to transcription factor occupancy.
Other sources of variable CTCF binding
Although we have shown that 41% of overall CTCF occupancy
variation is significantly linked to methylation at tested sites, 36%
of variable CTCF sites overlap no variable methylation at all. It
is unlikely that much of this variability is associated with genetic
variability in CTCF recognition sequences (Maurano et al. 2012),
though some sites may associate with modified forms of CTCF
(Klenova et al. 2001; Yu et al. 2004; MacPherson et al. 2009). One
likely possibility is that the constantly unmethylated variable CTCF
sites may represent instances of cooperative regulation that com-
plicate a direct relationship between methylation and CTCF oc-
cupancy. Accordingly, CTCF has been known to interact with
a number of cofactors that could poten-
tially govern its selectivity at these sites
or, alternatively, maintain demethyla-
tion in the absence of CTCF binding
(Chernukhin et al. 2007; Donohoe et al.
2007, 2009; Parelho et al. 2008; Rubio
et al. 2008; Wendt et al. 2008; Ohlsson
et al. 2010; Liu et al. 2011). Interestingly,
we found that of the 36% of variable sites
despite constant methylation, 76% were
within 2.5 kb of a RefSeq transcription
start site, compared with 38% of the var-
iable sites associated with methylation
differences. Recent work has further ob-
served an enrichment of tethered CTCF
peaks at promoters (Neph et al. 2012b),
suggesting that the remaining variation
in CTCF occupancy may derive from
complex regulation of co-factors or vari-
ation in its specific interaction partners.
Given the breadth of CTCF’s regulatory
functionality, our observation of global
binding variation implies a widespread
potential role in the translation of epige-
netic marks to genome organization at
thousands of sites.
Cells were cultured in an appropriate
growth medium, with the addition of
growth factors and supplements accord-
ing to the suppliers’ instructions (Supple-
mental Table S1). Cell lines were main-
tained in a humidified incubator at 37°C
in the presence of 5% CO2.
Suspension cells were cross-linked with
formaldehyde (Sigma) at a final concen-
tration of 1% for 10 min at room tem-
perature. Adherent cells were first de-
tached from the plates by 0.05% Trypsin-EDTA and Trypsin
neutralizer solution (Invitrogen) and then cross-linked by 1%
formaldehyde. Glycine was added to a final concentration of
saline, lysed in lysis buffer (50 mM Tris-HCl at pH 8.0, 10 mM
EDTA, 1% SDS) containing protease inhibitor cocktail (Roche),
and sheared by Bioruptor (Diagenode). The chromatin was in-
cubated with Dynabeads (M-280, sheep anti-rabbit IgG, Invi-
trogen)-conjugated anti-CTCF polyclonal antibody (Cell Signaling
The CTCF–DNA complexes were washed, eluted, and reverse
cross-linked. The DNA was RNase A–, Proteinase K–treated, and
purified by phenol-chloroform-isoamyl alcohol extraction and
ethanol precipitation. DNA was end-repaired (End-it DNA End-repair
kit, Epicentre), followed by the addition of adenine to the 3′
ends (Taq DNA polymerase, NEB), and ligated to an adapter (Illumina).
Purified ligation product was PCR amplified and run on
a 2% agarose gel. The size-selected libraries were sequenced on an
Illumina Genome Analyzer (Illumina) by the High-Throughput
Genomics Center (University of Washington) according to a stan-
Figure 5. Cell-selective patterns of methylation associated with occupancy differences. (A) Methylation
status at 1969 CTCF sites where differential methylation is significantly associated with occupancy
differences. Color corresponds to the percentage of bisulfite sequencing tags at each site overlapping
methylated CpG positions. Dendrogram (left) highlights pattern of hypermethylation in immortal cell
lines. (Right) Smoothed plot of number of immortal lines exhibiting hypermethylation at each site. (B)
Immortal lines show no significant difference in number of occupied CTCF sites (y-axis, mean). Error
bars, SD. (C) Immortal lines demonstrate increased CTCF transcript levels (y-axis, mean). Error bars, SD.
(D) Immortal lines exhibit increased methylation relative to the other cell types, though significant
promoter methylation is rarely observed in normal lines. y-axis, genome-wide median of per-site
methylation. P-values, Wilcoxon. Promoter, ±2.5 kb of RefSeq transcription start site.
For each cell type, experiments were conducted on two in-
dependent biological replicates.
Identification and quantification of CTCF binding sites
We obtained Uniform Element Calls from the ENCODE project for
each cell line. Briefly, peaks were called using SPP (Kharchenko
et al. 2008). The set of peaks reproducible in both replicates were
identified based on an irreproducible discovery rate (IDR) of 0.25%
(Li et al. 2011). We then combined peak calls from 19 cell types to
generate a master list of all distinct CTCF binding sites. We adjusted
the peak locations to center on matches to the nearest CTCF
motif (P < 10⁻⁵, fimo) if the motif was within 50 bp.
To distinguish between variable and constitutive binding
sites, for each site we examined the presence of a peak in each of 19
cell types. We used the peak calling program Hotspot (John et al.
2011) to enable a conservative procedure for the identification of
variable binding sites. To reduce the misclassification of sites near
the peak-calling threshold as variable, we employed separate cut-
offs for calling peak presence and absence. First, for each CTCF
binding site called above, we additionally required that it overlap
a binding site was counted as occupied in subsequent cell lines if
a looser 1% FDR hotspot was present in one or both replicates for
that cell line. Employing these looser criteria for binding in subsequent
cell types results in conservative identification of variable
sites. We confirmed that binding sites in cell types considered
absent were substantially closer to background than sites in cell
types considered active (Supplemental Fig. S2B).
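The two-cutoff logic can be sketched in a few lines of Python. This is an illustrative sketch only: the stringent criterion is represented by an opaque boolean because its exact definition is not recoverable from the text here, and the function names and toy calls are hypothetical.

```python
# Minimal sketch of separate cutoffs for calling presence and absence.
# All names and the toy data are hypothetical; `strict` stands in for the
# stringent call that admits a site, `loose_rep1`/`loose_rep2` for the
# looser 1% FDR hotspot call in each replicate.

def occupancy_calls(strict, loose_rep1, loose_rep2):
    """Return per-cell-type occupancy, or None if the site is never
    called at the stringent cutoff in any cell type."""
    if not any(strict.values()):
        return None
    return {
        cell: strict[cell] or loose_rep1[cell] or loose_rep2[cell]
        for cell in strict
    }

def is_variable(calls):
    """A site is variable only if it is unoccupied, even under the
    looser cutoff, in at least one cell type."""
    return not all(calls.values())

strict = {"cellA": True, "cellB": False, "cellC": False}
loose_rep1 = {"cellA": True, "cellB": True, "cellC": False}
loose_rep2 = {"cellA": True, "cellB": False, "cellC": False}

calls = occupancy_calls(strict, loose_rep1, loose_rep2)
print(calls, is_variable(calls))
```

In the toy data, cellB fails the stringent call but passes the looser one in a single replicate, so it counts as occupied; only cellC makes the site variable. This is the sense in which the looser criteria make the variable set conservative.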
ChIP-seq data were mapped to the human genome (GRCh37/
hg19) using bowtie (Langmead et al. 2009) with the options
“bowtie --mm -n 3 -v 3 -k 2 --phred64-quals,” allowing up to three
mismatches. Reads mapping to multiple locations were then excluded,
and reads with identical 5′ ends and strand were presumed
to be PCR duplicates and were excluded. Smoothed density tracks
to count the number of tags overlapping a sliding 150-bp window,
with a step width of 20 bp (Neph et al. 2012a). Density tracks were
normalized for sequencing depth by a global linear scaling to 10
million tags. We measured occupancy by the maximum normal-
ized ChIP-seq tag density over the 134-bp region.
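As a rough illustration of the density and occupancy measures, the following Python sketch counts tags in a sliding 150-bp window with a 20-bp step, scales linearly to 10 million tags, and takes the maximum over a 134-bp region. The function names, the window-centering convention, and the toy coordinates are assumptions, not the authors' implementation.

```python
# Minimal sketch of sliding-window tag density with depth normalization.
# Centering windows on each step position is an assumption for illustration.

def window_density(tag_positions, region_start, region_end, total_tags,
                   window=150, step=20, scale_to=10_000_000):
    """Count tags in sliding windows across a region and scale the
    counts linearly to a common sequencing depth."""
    scale = scale_to / total_tags
    densities = []
    for center in range(region_start, region_end + 1, step):
        lo, hi = center - window // 2, center + window // 2
        count = sum(1 for p in tag_positions if lo <= p < hi)
        densities.append((center, count * scale))
    return densities

def occupancy(densities):
    """Occupancy is the maximum normalized density over the region."""
    return max(value for _, value in densities)

# Toy data: three tags near the site, one far away, 20 million mapped tags.
tags = [1000, 1010, 1020, 1400]
densities = window_density(tags, region_start=950, region_end=1083,
                           total_tags=20_000_000)
site_occupancy = occupancy(densities)
print(len(densities), site_occupancy)
```

Because the toy library has twice the target depth, each raw count is halved before the maximum is taken.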
Reproducibility of ChIP-seq experiments was tested using
Pearson correlation on normalized density tracks of chromosome
19 between each replicate.
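The reproducibility check is a Pearson correlation between the normalized density values of two replicates. A minimal standard-library sketch, with made-up density values, might look like this:

```python
# Minimal sketch: Pearson correlation between two replicate density tracks.
# The replicate values below are invented for illustration.
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

rep1 = [0.0, 1.0, 4.0, 9.0, 2.0]
rep2 = [0.0, 2.0, 8.0, 18.0, 4.0]  # rep1 scaled by 2
r = pearson(rep1, rep2)
print(round(r, 6))
```

Pearson correlation is insensitive to a global linear rescaling of either track, so perfectly proportional replicates score 1.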
Clustering of cell-selective CTCF binding sites
We converted the presence and absence of a given peak to 1 and 0,
respectively, in 19 cell lines. We then performed hierarchical
clustering with the hclust function in R, using the ‘‘average’’
method and Euclidean distance metric. We cut the dendrogram
(Fig. 2A) into three groups, of immortal cell lines, epithelia, and
the R package pvclust (Suzuki and Shimodaira 2006) and principal
components analysis (Supplemental Fig. S4). We then used the
package DESeq (Anders and Huber 2010) on the tag count at each
peak to identify differentially occupied sites between each of these
three groups (FDR 1%).
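The clustering step (binary presence/absence profiles, Euclidean distance, average linkage, three groups) can be sketched in plain Python. The analysis itself used hclust in R; the version below is a small re-implementation for illustration, it stops merging at three clusters rather than building and cutting a dendrogram, and the cell-type labels and profiles are hypothetical.

```python
# Minimal sketch of average-linkage agglomerative clustering on binary
# peak presence/absence profiles. Labels and profiles are hypothetical.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_groups):
    """Merge the two closest clusters (mean pairwise distance between
    members) until n_groups clusters remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_groups:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return sorted(sorted(c) for c in clusters)

profiles = {
    "immortal_1":   [1, 1, 1, 0, 0, 0, 0, 0, 0],
    "immortal_2":   [1, 1, 0, 0, 0, 0, 0, 0, 0],
    "epithelial_1": [0, 0, 0, 1, 1, 1, 0, 0, 0],
    "epithelial_2": [0, 0, 0, 1, 1, 0, 0, 0, 0],
    "fibroblast_1": [0, 0, 0, 0, 0, 0, 1, 1, 1],
    "fibroblast_2": [0, 0, 0, 0, 0, 0, 1, 1, 0],
}
groups = average_linkage(profiles, n_groups=3)
print(groups)
```

On binary data, squared Euclidean distance is simply the number of sites at which two cell types differ, so cell types sharing most variable sites merge first.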
RRBS genome-wide methylation profiling
We downloaded RRBS methylation data for 13 cell lines from the
TE Reddy, F Pauli, MK Cross, BA Williams, JA Stamatoyannopoulos,
GE Crawford, et al., in prep.) of the UCSC Genome Browser. To
measure methylation in each cell line, we combined counts for both
strands in both replicates and removed data for samples with less
than 8× coverage. We retained only CpGs monitored in at least six
samples (Supplemental Fig. S6B).
We applied a linear regression to measure whether methyla-
tion status is associated with occupancy. We normalized CTCF
occupancies using the getVarianceStabilizedData function of DESeq
and then averaged replicate signals. We regressed CTCF occupancy
onto the average proportion methylated of all monitored CpGs in
a 134-bp region centered around the CTCF peak. We excluded 1806
sites missing RRBS data and ChIP-seq data for seven or more cell
types or having too great a difference in the number of CpGs mon-
itored between any two cell types (more than six CpGs monitored).
We averaged the methylation level of all CpGs within a 134-bp
window to increase sensitivity and reliability. We excluded sites
where the number of monitored CpGs differed by more than four
among any two cell lines. We used the R package qvalue to estimate
an FDR (Storey and Tibshirani 2003).
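The per-site test can be illustrated with an ordinary least-squares slope of occupancy on methylation, followed by a multiple-testing correction. Note the substitution: the analysis used q-values from the R package qvalue, whereas the sketch below uses the simpler Benjamini–Hochberg adjustment as a stand-in. The function names, the toy methylation and occupancy values, and the example P-values are all invented for illustration.

```python
# Minimal sketch: per-site regression slope plus Benjamini-Hochberg FDR.
# BH is a stand-in for the qvalue package; all data here are invented.

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def benjamini_hochberg(pvalues):
    """Return BH-adjusted P-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# One toy site: proportion methylated vs. normalized occupancy in 5 cell types.
methylation = [0.0, 0.1, 0.5, 0.9, 1.0]
occupancy_values = [10.0, 9.0, 5.0, 1.0, 0.0]
slope = ols_slope(methylation, occupancy_values)

# Toy per-site P-values, one per tested site.
adjusted = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
n_significant = sum(1 for q in adjusted if q <= 0.05)
print(round(slope, 6), n_significant)
```

A negative slope corresponds to the inverse relationship reported for nearly all significant sites; the correction step shows why a nominal P-value below 0.05 does not guarantee significance at a 5% FDR.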
RNA expression analysis
For each cell line, total RNA was extracted in two replicates from
5 × 10⁶ cells using Ribopure (Ambion) according to the manufacturer’s
instructions. RNA quality was ascertained using RNA
6000 Nano Chips on a bioanalyzer (Agilent). Approximately 3 μg
of total RNA for each sample was used for labeling and hybridiza-
tion (University of Washington Center for Array Technology) to
Affymetrix Human Exon 1.0 ST arrays (Affymetrix) using a stan-
dard protocol. Exon expression data were analyzed through Affy-
metrix Expression Console using gene-level RMA summarization
and sketch-quantile normalization method. Measurements from
both replicates were then averaged.
CTCF ChIP-seq data have been submitted to the NCBI Gene Ex-
accession no. GSE30263. Affymetrix exon array data are available
under accession no. GSE19090. RRBS methylation data are under
accession no. GSE27584. All three sets are available for viewing in
the UCSC Genome Browser (http://genome.ucsc.edu/).
We thank Jeff Vierstra, Andrew Stergachis, and Sam John for crit-
ical reading of the manuscript and many helpful suggestions. We
also thank Daniel Bates, Morgan Diegel, and Doug Dunn at the
University of Washington High-Throughput Genomics Center
for technical assistance. This work was supported by National
Institutes of Health grants U54HG004592 (J.A.S.) and U54HG0
Author contributions: H.W., M.T.M., and J.A.S. conceived the
study. H.W. and T.C. cultured cells. H.W. and K.L. produced ChIP-seq
data. H.W. and M.W. generated Illumina libraries. K.E.V., J.G., and F.P.
generated RRBS data under the supervision of R.M.M. M.T.M., R.S.,
and R.E.T. processed data. M.T.M. and H.Q. analyzed data. R.K. over-
saw production data collection and aspects of primary analysis. H.W.
and M.T.M. wrote the manuscript, with contributions from J.A.S.
Anders S, Huber W. 2010. Differential expression analysis for sequence
count data. Genome Biol 11: R106. doi: 10.1186/gb-2010-11-10-r106.
Baniahmad A, Steiner C, Köhne AC, Renkawitz R. 1990. Modular structure
of a chicken lysozyme silencer: Involvement of an unusual thyroid
hormone receptor binding site. Cell 61: 505–514.
Bell AC, Felsenfeld G. 2000. Methylation of a CTCF-dependent
boundary controls imprinted expression of the Igf2 gene. Nature 405:
Butcher DT, Mancini-DiNardo DN, Archer TK, Rodenhiser DI. 2004. DNA
binding sites for putative methylation boundaries in the unmethylated
region of the BRCA1 promoter. Int J Cancer 111: 669–678.
Chadwick BP. 2008. DXZ4 chromatin adopts an opposing conformation to
that of the surrounding chromosome and acquires a novel inactive
Chernukhin I, Shamsuddin S, Kang SY, Bergstrom R, Kwon YW, Yu W,
Whitehead J, Mukhopadhyay R, Docquier F, Farrar D, et al. 2007.
CTCF interacts with and recruits the largest subunit of RNA
polymerase II to CTCF target sites genome-wide. Mol Cell Biol 27:
Cuddapah S, Jothi R, Schones DE, Roh TY, Cui K, Zhao K. 2008. Global
analysis of the insulator binding protein CTCF in chromatin barrier
regions reveals demarcation of active and repressive domains. Genome
Res 19: 24–32.
Da ´valos-SalasM,Furlan-MagarilM,Gonza ´lez-Buendı ´aE,Valdes-Quezada C,
Ayala-Ortega E, Recillas-Targa F. 2011. Gain of DNA methylation is
enhanced in the absence of CTCF at the human retinoblastoma gene
promoter. BMC Cancer 11: 232. doi: 10.1186/1471-2407-11-232.
Dickson J, Gowher H, Strogantsev R, Gaszner M, Hair A, Felsenfeld G, West
AG. 2010. VEZF1 elements mediate protection from DNA methylation.
PLoS Genet 6: e1000804. doi: 10.1371/journal.pgen.1000804.
Docquier F, Farrar D, D’Arcy V, Chernukhin I, Robinson AF, Loukinov D,
Vatolin S, Pack S, Mackay A, Harris RA, et al. 2005. Heightened
expression of CTCF in breast cancer cells is associated with resistance to
apoptosis. Cancer Res 65: 5112–5122.
Donohoe ME, Zhang L-F, Xu N, Shi Y, Lee JT. 2007. Identification of a Ctcf
cofactor, Yy1, for the X chromosome binary switch. Mol Cell 25: 43–56.
Donohoe ME, Silva SS, Pinter SF, Xu N, Lee JT. 2009. The pluripotency factor
Oct4 interacts with Ctcf and also controls X-chromosome pairing and
counting. Nature 460: 128–132.
Engel N, West AG, Felsenfeld G, Bartolomei MS. 2004. Antagonism between
DNA hypermethylation and enhancer-blocking activity at the H19
DMD is uncovered by CpG mutations. Nat Genet 36: 883–888.
Enver T, Zhang JW, Papayannopoulou T, Stamatoyannopoulos G. 1988.
DNA methylation: A secondary event in globin gene switching? Genes
Dev 2: 698–706.
Filippova GN, Fagerlie S, Klenova EM, Myers C, Dehner Y, Goodwin G,
Neiman PE, Collins SJ, Lobanenkov VV. 1996. An exceptionally
conserved transcriptional repressor, CTCF, employs different
combinations of zinc fingers to bind diverged promoter sequences of
avian and mammalian c-myc oncogenes. Mol Cell Biol 16: 2802–2813.
Filippova GN, Lindblom A, Meincke LJ, Klenova EM, Neiman PE, Collins SJ,
Doggett NA, Lobanenkov VV. 1998. A widely expressed transcription
factor with multiple DNA sequence specificity, CTCF, is localized at
chromosome segment 16q22.1 within one of the smallest regions of
overlap for common deletions in breast and prostate cancers. Genes
Chromosomes Cancer 22: 26–36.
Filippova GN, Thienes CP, Penn BH, Cho DH, Hu YJ, Moore JM, Klesert TR,
Lobanenkov VV, Tapscott SJ. 2001. CTCF-binding sites flank CTG/CAG
Genet 28: 335–343.
Gebhard C, Benner C, Ehrich M, Schwarzfischer L, Schilling E, Klug M,
Dietmaier W, Thiede C, Holler E, Andreesen R, et al. 2010. General
transcription factor binding at CpG islands in normal cells correlates
with resistance to de novo DNA methylation in cancer cells. Cancer Res
Hark AT, Schoenherr CJ, Katz DJ, Ingram RS, Levorse JM, Tilghman SM.
2000. CTCF mediates methylation-sensitive enhancer-blocking activity
at the H19/Igf2 locus. Nature 405: 486–489.
Heintzman ND, Hon GC, Hawkins RD, Kheradpour P, Stark A, Harp LF, Ye Z,
Lee LK, Stuart RK, Ching CW, et al. 2009. Histone modifications at
human enhancers reflect global cell-type-specific gene expression.
Nature 459: 108–112.
Hou C, Zhao H, Tanimoto K, Dean A. 2008. CTCF-dependent enhancer-
blocking by alternative chromatin loop formation. Proc Natl Acad Sci
John S, Sabo PJ, Thurman RE, Sung M-H, Biddie SC, Johnson TA, Hager GL,
Stamatoyannopoulos JA. 2011. Chromatin accessibility pre-
determines glucocorticoid receptor binding patterns. Nat Genet 43:
Jones PA, Baylin SB. 2007. The epigenomics of cancer. Cell 128: 683–692.
Kanduri C, Pant V, Loukinov D, Pugacheva E, Qi CF, Wolffe A, Ohlsson R,
Lobanenkov VV. 2000. Functional association of CTCF with the
insulator upstream of the H19 gene is parent of origin-specific and
methylation-sensitive. Curr Biol 10: 853–856.
seq experiments for DNA-binding proteins. Nat Biotechnol 26: 1351–
MQ, Lobanenkov VV, Ren B. 2007. Analysis of the vertebrate insulator
protein CTCF-binding sites in the human genome. Cell 128: 1231–
Klenova EM, Chernukhin IV, El-Kady A, Lee RE, Pugacheva EM, Loukinov
DI, Goodwin GH, Delgado D, Filippova GN, Leon J, et al. 2001.
Functional phosphorylation sites in the C-terminal region of the
multivalent multifunctional transcriptional factor CTCF. Mol Cell Biol
Lai AY, Fatemi M, Dhasarathy A, Malone C, Sobol SE, Geigerman C, Jaye DL,
Mav D, Shah R, Li L, et al. 2010. DNA methylation prevents CTCF-
mediated silencing of the oncogene BCL6 in B cell lymphomas. J Exp
Med 207: 1939–1950.
Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-
efficient alignment of short DNA sequences to the human genome.
Genome Biol 10: R25. doi: 10.1186/gb-2009-10-3-r25.
Lefevre P, Witham J, Lacroix CE, Cockerill PN, Bonifer C. 2008. The LPS-
induced transcriptional upregulation of the chicken lysozyme locus
involves CTCF eviction and noncoding RNA transcription. Mol Cell 32:
LiQ,BrownJB, HuangH,BickelPJ. 2011. Measuring reproducibility ofhigh-
throughput experiments. Ann Appl Stat 5: 1752–1779.
Lienert F, Wirbelauer C, Som I, Dean A, Mohn F, Schu ¨beler D. 2011.
Identification of genetic elements that autonomously determine DNA
methylation states. Nat Genet 43: 1091–1097.
Lin IG, Tomzynski TJ, Ou Q, Hsieh CL. 2000. Modulation of DNA binding
protein affinity directly affects target site demethylation. Mol Cell Biol
ListerR,Pelizzola M,Dowen RH,HawkinsRD,HonG,Tonti-FilippiniJ,Nery
JR, Lee L, Ye Z, Ngo Q-M, et al. 2009. Human DNA methylomes at base
resolution show widespread epigenomic differences. Nature 462: 315–
Liu Z, Scannell DR, Eisen MB, Tjian R. 2011. Control of embryonic stem
cell lineage commitment by core promoter factor, TAF3. Cell 146:
MacPherson MJ, Beatty LG, Zhou W, Du M, Sadowski PD. 2009. The CTCF
insulator protein is posttranslationally modified by SUMO. Mol Cell Biol
Matsuo K, Silke J, Georgiev O, Marti P, Giovannini N, Rungger D. 1998. An
embryonic demethylation mechanism involving binding of
transcription factors to replicating DNA. EMBO J 17: 1446–1453.
Matsuzaki H, Okamura E, Fukamizu A, Tanimoto K. 2010. CTCF binding
is not the epigenetic mark that establishes post-fertilization
methylation imprinting in the transgenic H19 ICR. Hum Mol Genet
Maurano MT, Wang H, Kutyavin T, Stamatoyannopoulos JA. 2012.
Widespread site-dependent buffering of human regulatory
polymorphism. PLoS Genet 8: e1002599. doi: 10.1371/journal.pgen.
Mukhopadhyay R, Yu W, Whitehead J, Xu J, Lezcano M, Pack S, Kanduri C,
Kanduri M, Ginjala V, Vostrov A, et al. 2004. The binding sites for the
chromatin insulator protein CTCF map to DNA methylation-free
domains genome-wide. Genome Res 14: 1594–1602.
Neph S, Kuehn MS, Reynolds AP, Haugen E, Thurman RE, Johnson AK,
Rynes E, Maurano MT, Vierstra J, Thomas S, et al. 2012a. BEDOPS: High
performance genomic feature operations. Bioinformatics doi: 10.1093/
RE, Sandstrom R, Johnson AK, Maurano MT, et al. 2012b. An expansive
human regulatory lexicon encoded in transcription factor footprints.
Nature (in press).
Ohlsson R, Lobanenkov V, Klenova E. 2010. Does CTCF mediate between
nuclear organization and gene expression? Bioessays 32: 37–50.
Pant V, Mariano P, Kanduri C, Mattsson A, Lobanenkov V, Heuchel R,
Ohlsson R. 2003. The nucleotides responsible for the direct physical
contact between the chromatin insulator protein CTCF and the H19
imprinting control region manifest parent of origin-specific long-
distance insulation and methylation-free domains. Genes Dev 17: 586–
Pant V, Kurukuti S, Pugacheva E, Shamsuddin S, Mariano P, Renkawitz R,
Klenova E, Lobanenkov V, Ohlsson R. 2004. Mutation of a single CTCF
target site within the H19 imprinting control region leads to loss of Igf2
imprinting and complexpatterns of de novo methylation upon maternal
inheritance. Mol Cell Biol 24: 3497–3504.
Parelho V, Hadjur S, Spivakov M, Leleu M, Sauer S, Gregson HC, Jarmuz A,
Cell-selective CTCF occupancy landscapes
associate with CTCF on mammalian chromosome arms. Cell 132: 422– Download full-text
Rasko JE, Klenova EM, Leon J, Filippova GN, Loukinov DI, Vatolin S,
Robinson AF, Hu YJ, Ulmer J, Ward MD, et al. 2001. Cell growth
inhibition by the multifunctional multivalent zinc-finger factor CTCF.
Cancer Res 61: 6002–6007.
Renda M, Baglivo I, Burgess-Beusse B, Esposito S, Fattorusso R, Felsenfeld G,
Pedone PV. 2007. Critical DNA binding interactions of the insulator
protein CTCF: A small number of zinc fingers mediate strong binding,
and a single finger-DNA interaction controls binding at imprinted loci.
J Biol Chem 282: 33336–33345.
Rubio ED, Reiss DJ, Welcsh PL, Disteche CM, Filippova GN, Baliga NS,
Aebersold R, Ranish JA, Krumm A. 2008. CTCF physically links cohesin
to chromatin. Proc Natl Acad Sci 105: 8309–8314.
Schoenherr CJ, Levorse JM, Tilghman SM. 2002. CTCF maintains
differential methylation at the Igf2/H19 locus. Nat Genet 33: 66–69.
Seitan VC, Hao B, Tachibana-Konwalski K, Lavagnolli T, Mira-Bontenbal H,
Brown KE, Teng G, Carroll T, Terry A, Horan K, et al. 2011. A role for
cohesin in T-cell-receptor rearrangement and thymocyte
differentiation. Nature 476: 467–471.
Sekimata M, Pe ´rez-Melgosa M, Miller SA, Weinmann AS, Sabo PJ, Sandstrom
R, Dorschner MO, Stamatoyannopoulos JA, Wilson CB. 2009. CCCTC-
binding factor and the transcription factor T-bet orchestrate T helper 1
cell-specific structure and function at the interferon-g locus. Immunity
Selker EU. 1990. DNA methylation and chromatin structure: A view from
below. Trends Biochem Sci 15: 103–107.
Shukla S, Kavak E, Gregory M, Imashimizu M, Shutinoski B, Kashlev M,
Oberdoerffer P, Sandberg R, Oberdoerffer S. 2011. CTCF-promoted RNA
polymerase II pausing links DNA methylation to splicing. Nature 479:
Soto-Reyes E,Recillas-TargaF.2010. Epigenetic regulation of the humanp53
gene promoter by the CTCF transcription factor in transformed cell
lines. Oncogene 29: 2217–2227.
Splinter E. 2006. CTCF mediates long-range chromatin looping and local
histone modification in the b-globin locus. Genes Dev 20: 2349–
Stadler MB, Murr R, Burger L, Ivanek R, Lienert F, Scho ¨ler A, Wirbelauer C,
Oakeley EJ, Gaidatzis D, Tiwari VK, et al. 2011. DNA-binding factors
shape the mouse methylome at distal regulatory regions. Nature 480:
Storey JD, Tibshirani R. 2003. Statistical significance for genomewide
studies. Proc Natl Acad Sci 100: 9440–9445.
Straussman R, Nejman D, Roberts D, Steinfeld I, Blum B, Benvenisty N,
Simon I, Yakhini Z, Cedar H. 2009. Developmental programming of
CpG island methylation profiles in the human genome. Nat Struct Mol
Biol 16: 564–571.
Suzuki R, Shimodaira H. 2006. Pvclust: An R package for assessing the
uncertainty in hierarchical clustering. Bioinformatics 22: 1540–
Tate PH, Bird AP. 1993. Effects of DNA methylation on DNA-binding
proteins and gene expression. Curr Opin Genet Dev 3: 226–231.
Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E,
Sheffield NC, Stergachis AB, Wang H, Vernot B, et al. 2012. The
accessible chromatin landscape of thehuman genome. Nature (in press).
a role in transcriptional activation. J Biol Chem 272: 33353–33359.
Walsh CP, Bestor TH. 1999. Cytosine methylation and mammalian
development. Genes Dev 13: 26–34.
Wendt KS, Yoshida K, Itoh T, Bando M, Koch B, Schirghuber E, Tsutsumi S,
Nagae G, Ishihara K, Mishiro T, et al. 2008. Cohesin mediates
transcriptional insulation by CCCTC-binding factor. Nature 451:
Witcher M, Emerson BM. 2009. Epigenetic silencing of the p16INK4atumor
suppressor is associated with loss of CTCF binding and a chromatin
boundary. Mol Cell 34: 271–284.
Yu W, Ginjala V, Pant V, Chernukhin I, Whitehead J, Docquier F, Farrar D,
Tavoosidana G, Mukhopadhyay R, Kanduri C, et al. 2004. Poly(ADP-
ribosyl)ation regulates CTCF-dependent chromatin insulation. Nat
Genet 36: 1105–1110.
Yusufzai TM, Tagami H, Nakatani Y, Felsenfeld G. 2004. CTCF tethers an
insulator to subnuclear sites, suggesting shared insulator mechanisms
across species. Mol Cell 13: 291–298.
Received December 9, 2011; accepted in revised form April 30, 2012.