DNA methylation is widespread and associated
with differential gene expression in castes
of the honeybee, Apis mellifera
Navin Elango1, Brendan G. Hunt1, Michael A. D. Goodisman, and Soojin V. Yi2
School of Biology, Georgia Institute of Technology, Atlanta, GA 30332
Edited by Mary Jane West-Eberhard, Smithsonian Tropical Research Institute, Costa Rica, and approved May 14, 2009 (received for review January 12, 2009)
The recent, unexpected discovery of a functional DNA methylation
system in the genome of the social bee Apis mellifera underscores
the potential importance of DNA methylation in invertebrates. The
extent of genomic DNA methylation and its role in A. mellifera
remain unknown, however. Here we show that genes in A. mel-
lifera can be divided into 2 distinct classes, one with low-CpG
dinucleotide content and the other with high-CpG dinucleotide
content. This dichotomy is explained by the gradual depletion of
CpG dinucleotides, a well-known consequence of DNA methyl-
ation. The loss of CpG dinucleotides associated with DNA methyl-
ation also may explain the unusual mutational patterns seen in A.
mellifera that lead to AT-rich regions of the genome. A detailed
investigation of this dichotomy implicates DNA methylation in A.
mellifera development. High-CpG genes, which are predicted to be
hypomethylated in germlines, are enriched with functions associ-
ated with developmental processes, whereas low-CpG genes, pre-
dicted to be hypermethylated in germlines, are enriched with
functions associated with basic biological processes. Furthermore,
genes more highly expressed in one caste than another are over-
represented among high-CpG genes. Our results highlight the
potential significance of epigenetic modifications, such as DNA
methylation, in developmental processes in social insects. In par-
ticular, the pervasiveness of DNA methylation in the genome of A.
mellifera provides fertile ground for future studies of phenotypic
plasticity and genomic imprinting.
comparative genomics | phenotypic plasticity
DNA methylation occurs in the genomes of a wide array of
bacteria, plants, fungi, and animals (1, 2). In particular, the
methylation of cytosine bases represents an important epigenetic
mark that affects gene expression in diverse taxa (1, 3). Despite the
phylogenetically widespread and ancient origin of DNA methylation,
its extent varies among taxa (2); for example, whereas vertebrate
genomes tend to show extensive methylation, invertebrate genomes
tend to show reduced or minimal levels of methylation (1, 2, 4).
Variation in genome methylation patterns is of great interest because
it suggests that the role of DNA methylation is not strictly conserved
among species. Thus, information on the nature and extent of DNA
methylation in diverse taxa continues to be a valuable resource for
exploring the role of this DNA modification in eukaryotes (2, 3, 5).
Recent research has identified a functional DNA methylation
system in a social insect, the honeybee, Apis mellifera (6). Social
insects are among the most successful of animal taxa (7, 8). Their
success stems from the cooperative behaviors displayed by society
members. The defining feature of hymenopteran social insects (ants,
some bees, and some wasps) is a reproductive division of labor, whereby
individuals of the queen caste reproduce while members of the
worker caste defend the nest, forage, and rear the young. This
division of individuals into alternate castes represents a key
evolutionary transition that allowed social insects to come to dominate
many terrestrial ecosystems (10, 11).
Remarkably, DNA methylation appears to be directly associated
with the differentiation of castes in A. mellifera (12, 13). Kucharski
et al. (12) demonstrated that down-regulation of a key DNA
methyltransferase (Dnmt3) in developing A. mellifera larvae re-
sulted in profound changes in caste developmental trajectories.
Accordingly, DNA methylation may represent an important mech-
anism facilitating the evolution of social systems (14).
Despite the potential importance of DNA methylation, the
genome-wide patterns of methylation within the A. mellifera ge-
nome remain poorly understood. This is unfortunate, because
knowledge of the patterns of DNA methylation in the A. mellifera
(6, 12). Moreover, linking molecular changes such as DNA meth-
ylation with the evolution and development of social phenotypes
remains one of the major challenges in understanding sociality (15,
16). In this study, we investigated the nature of DNA methylation
in A. mellifera by analyzing global patterns of methylation using
computational methods and comparing them with experimental
results in A. mellifera and other species. We found that DNA
methylation is widespread and has played a critical role in A.
mellifera genome evolution, and that it is associated with important
developmental processes, including caste formation.
Depletion of CpG Dinucleotides Suggests Widespread Gene Methylation
in A. mellifera. We used "normalized" CpG content (CpGO/E)
to infer the pattern of DNA methylation in A. mellifera. CpGO/E is
a robust measure of the level of DNA methylation on an evolu-
tionary time scale due to specific mutational mechanisms of meth-
ylated cytosines (17–20). In brief, methylated cytosines are hyper-
mutable due to their vulnerability to spontaneous deamination,
which causes a gradual depletion of CpG dinucleotides from
methylated regions over time (21). Consequently, genomic regions
that are subject to heavy germline DNA methylation (hypermethy-
lated) lose CpG dinucleotides over time and have lower-than-
expected CpGO/E. In contrast, regions that undergo little germline
DNA methylation (hypomethylated) maintain high CpGO/E. This
measure has been successfully used to indirectly measure historical
levels of DNA methylation. Moreover, the level of
methylation inferred from CpGO/E corresponds well to the actual
pattern of DNA methylation in such diverse taxa as human and sea
squirt (19, 20).
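The normalized CpG measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `cpg_oe`, the toy sequences, and the choice to divide the dinucleotide count by the number of dinucleotide positions (sequence length minus 1) are all assumptions made for the example.

```python
def cpg_oe(seq: str) -> float:
    """Return normalized CpG content (observed/expected) for one sequence.

    Assumption for this sketch: the CpG frequency is taken over the n - 1
    dinucleotide positions, and C and G frequencies over the n bases.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least 2 bases")
    # Observed frequency of CpG dinucleotides.
    p_cpg = seq.count("CG") / (n - 1)
    # Frequencies of the individual C and G nucleotides.
    p_c = seq.count("C") / n
    p_g = seq.count("G") / n
    if p_c == 0 or p_g == 0:
        raise ValueError("CpG O/E is undefined without both C and G")
    return p_cpg / (p_c * p_g)


# Hypothetical toy sequences: the second mimics deamination of
# methylated CpG to TpG at 8 of the 10 CpG sites of the first.
intact = "ACGTCCAG" * 10
depleted = intact.replace("CG", "TG", 8)

print("intact:  ", round(cpg_oe(intact), 3))
print("depleted:", round(cpg_oe(depleted), 3))
```

The depleted toy sequence yields a lower value than the intact one, which is the signature used in the text to infer historical germline methylation.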
Author contributions: N.E., B.G.H., M.A.D.G., and S.V.Y. designed research; N.E., B.G.H.,
M.A.D.G., and S.V.Y. performed research; M.A.D.G. and S.V.Y. contributed new reagents/
analytic tools; N.E., B.G.H., M.A.D.G., and S.V.Y. analyzed data; and N.E., B.G.H., M.A.D.G.,
and S.V.Y. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1 N.E. and B.G.H. contributed equally to this work.
2 To whom correspondence should be addressed. E-mail: email@example.com.
This article contains supporting information online at www.pnas.org/cgi/content/full/
July 7, 2009 | vol. 106 | no. 27 | www.pnas.org/cgi/doi/10.1073/pnas.0900301106
We first examined the distribution of CpGO/E in several insect
genomes. We focused on analyses of genes, because the annotation
of other genomic regions (e.g., intergenic regions and noncoding
functional elements) in insect genomes other than Drosophila
melanogaster is far from complete.
D. melanogaster lacks critical DNA methyltransferases (2, 23). Accordingly, the
CpGO/E in D. melanogaster genes has an approximately normal
distribution with a mean around 1 (Fig. 1A). Analyses of other
published insect genomes, including Tribolium castaneum and
Anopheles gambiae, yield similar patterns (Fig. 1 B and C). Thus,
genes in these insects exhibit little evidence of DNA methylation
according to mutational decay of CpG dinucleotides.
In contrast, we find that the CpGO/E of A. mellifera genes exhibits
a striking bimodal pattern that is best explained by a mixture of 2
distinct distributions (Fig. 1D; see Materials and Methods). The
CpGO/E of approximately half of the A. mellifera genes has a
distribution with a remarkably high mean of 1.50 (SD = 0.20),
similar to the genomic background (see also ref. 24). Surprisingly,
the other half of the A. mellifera genes have a distinct distribution
with a mean much lower than the genome average (mean = 0.55,
SD = 0.20; Fig. 1D). Low CpGO/E is a signature of DNA methylation.
We refer to genes
belonging to the first category as "high-CpG" genes and those
within the latter category as "low-CpG" genes.
Because the GC content of the A. mellifera genome
is highly heterogeneous (24, 25), we examined whether the observed
bimodality in CpG content arises from a bias in nucleotide
composition. Previous studies have shown a positive correlation
between GC content and CpG content. Likewise,
GC content and CpG content are strongly correlated in the A.
mellifera genome (Kendall's correlation coefficient, τ = 0.32; P <
10^-15). Thus, it is possible that the distribution of CpG content
reflects the influence of GC content. To explore this possibility, we
investigated the distribution of normalized GpC content (GpCO/E).
GpC dinucleotides have the same C and G composition as CpG
dinucleotides but are not targets of DNA methylation. For
this reason, GpCO/E often is used as an indicator of nucleotide
composition bias while controlling for the influence of DNA
methylation (22, 29).
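The GpC control can be illustrated by generalizing the observed/expected ratio to any dinucleotide. The sketch below is an assumption-laden example rather than the authors' code: `dinucleotide_oe` and the toy sequence are hypothetical, and frequencies are normalized the same way as in a plain CpGO/E calculation (dinucleotides over n - 1 positions, nucleotides over n bases).

```python
def dinucleotide_oe(seq: str, dinuc: str) -> float:
    """Observed/expected ratio for an arbitrary dinucleotide such as 'CG' or 'GC'."""
    seq, dinuc = seq.upper(), dinuc.upper()
    n = len(seq)
    if len(dinuc) != 2 or n < 2:
        raise ValueError("need a 2-letter dinucleotide and at least 2 bases")
    # Slide over every overlapping dinucleotide position.
    hits = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc)
    observed = hits / (n - 1)
    # Expected frequency under independent association of the two bases.
    expected = (seq.count(dinuc[0]) / n) * (seq.count(dinuc[1]) / n)
    if expected == 0:
        raise ValueError("expected frequency is zero for this sequence")
    return observed / expected


# Hypothetical toy gene: CpG and GpC share the same C and G composition,
# so a large gap between the two ratios is not explained by GC content alone.
toy_gene = "ATGGCATTGCAGCTTGCCAAGCTGA"
cpg = dinucleotide_oe(toy_gene, "CG")
gpc = dinucleotide_oe(toy_gene, "GC")
print("CpG O/E:", round(cpg, 3), "GpC O/E:", round(gpc, 3))
```

Running the same function over all 16 dinucleotides for every gene would give the per-dinucleotide distributions that the text compares for modality.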
We find that the distribution of GpCO/E in A. mellifera is
unimodal (Fig. S1). The GpCO/E distribution in D. melanogaster is
unimodal as well, as expected (results not shown). Moreover,
analyses of all other dinucleotides in A. mellifera clearly show that
bimodality is exclusive to CpG dinucleotides (Fig. S1). These
findings indicate that the observed bimodality of CpGO/E in A.
mellifera genes reflects the effect of DNA
methylation on an evolutionary time scale; hypermethylated genes
exhibit CpG depletion, whereas hypomethylated genes have high
CpGO/E.
Further support for the link between CpG content and the level
of DNA methylation comes from an analysis of CpGO/E profiles of
genes in a distantly related invertebrate, Ciona intestinalis. C.
intestinalis is the only invertebrate whose genomic pattern of DNA
methylation has been experimentally investigated to date (19, 30),
and its CpGO/E level has been shown to correspond to the actual
level of DNA methylation (19). Furthermore, A. mellifera genes
shown to be methylated in a previous study (6) are all found in the
low-CpG class, as predicted by the proposed model (Fig. 1D).
Fig. 1. Contrasting patterns of DNA [...] specific CpGO/E values given
on the x-axis. The distribution of CpGO/E in D. melanogaster (A),
A. gambiae (B), and T. castaneum (C) genes follows a unimodal
distribution, reflecting a relative lack of DNA methylation in these
species. In contrast, the distribution of CpGO/E in A. mellifera
genes (D) is bimodal, likely demonstrating the effects of DNA
methylation of CpG dinucleotides. Panel D also shows the
position of the 5 genes [GB16767 (CpGO/E = 0.56), GB19399 (0.66),
GB18099 (0.67), GB12504 (0.75), and XP_001121083 (0.71)]
found to be methylated in a previous study (6). Note that we could
not map the gene GB15223 using our experimental procedure.
To explore whether DNA methylation is widespread in genomic
regions other than genes, we analyzed the CpGO/E distribution of
the entire A. mellifera genome, as well as putative promoter regions
(500 base pairs or 1,000 base pairs upstream of transcription start
sites), untranslated regions, and transposable elements. Our analyses
demonstrate that the strong bimodality of CpGO/E is unique to
amino acid–encoding sequences [supporting information (SI) text
and Figs. S2 and S3]. Only coding sequences harbor substantial
portions of the low-CpG class, bearing evolutionary signatures of
DNA methylation. This result is consistent with the earlier finding
that CpG methylation in A. mellifera is found predominantly in
exons (6). The pattern of CpG depletion in A. mellifera introns is
bimodal as well (Fig. S2), suggesting that some introns are methylated,
although exons appear to be the
primary targets of DNA methylation (6).
The "bimodality" of A. mellifera genes, which represents an intragenic
evolutionary signature of methylation, correlates with gene func-
tion; genes found in low-CpG and high-CpG classes are involved in
specific biological processes (31). Specifically, the low-CpG and
high-CpG classes are enriched with distinct Gene Ontology (GO)
categories (Table 1). Low-CpG genes, predicted to be hypermethy-
lated in the germlines, are significantly enriched for terms related
to metabolism and ubiquitous housekeeping functions of gene
expression and translation (Table 1 and Table S1). In contrast,
high-CpG genes, which are predicted to be hypomethylated in the
germlines, exhibit a striking and significant enrichment of terms
associated with various developmental processes, cellular commu-
nication, and adhesion (Table 1 and Table S1).
Enriched in the High-CpG Class. Social insect development is marked
by a remarkable level of phenotypic plasticity. In particular, many
hymenopteran social insect females can develop into distinctive
queen and worker castes from identical genomes. Recent studies
suggest that DNA methylation contributes to caste differentiation in
A. mellifera, by silencing crucial genes involved in caste formation.
If so,
then genes that are overexpressed in a specific caste, or
"caste-specific" genes, may show preferential enrichment in low-CpG or
high-CpG genes. We tested this prediction using a data set from a
recent study that identified differential gene expression in brains of
queens and sterile workers (32).
We first examined whether genes that were identified as
caste-specific (at a 5% significance level) tend to be biased toward a
specific CpG-content class (low-CpG or high-CpG). We find that
caste-specific genes include a higher proportion of high-CpG genes than
do caste-generic genes (genes
that are not differentially expressed between the castes; Table 2). The
enrichment of high-CpG genes increases with the bias toward
caste-specific expression (Table 2; Fig. 2). Moreover, the degree of
caste-specificity [measured as the absolute value of
log2(queen/worker) gene expression] is significantly positively
correlated with CpGO/E (Spearman's rank correlation, rs = 0.1405;
P = 2.80e-09; Fig. 2).
We further expanded our analyses to genes implicated in A.
mellifera caste differentiation identified by previous studies of gene
expression (33–38). Again we found that caste-specific genes over-
whelmingly belong to the high-CpG class (Table 3). Note that
caste-specific genes are not necessarily those implicated solely in
developmental processes; many of these genes perform basic bio-
logical functions (Table 3).
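The caste-specificity measure and its rank correlation with CpGO/E can be sketched as follows. This is an illustrative example under stated assumptions: the function names and all expression and CpGO/E values are hypothetical, and the rank correlation is implemented here as a Pearson correlation on tie-averaged ranks, which is the textbook definition of Spearman's coefficient rather than a confirmed detail of the authors' analysis.

```python
import math


def caste_specificity(queen: float, worker: float) -> float:
    """Absolute value of log2(queen/worker) expression."""
    return abs(math.log2(queen / worker))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rank correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression levels and CpG O/E values for 4 genes.
queen = [8.0, 5.0, 6.0, 2.0]
worker = [1.0, 5.0, 3.0, 4.0]
cpg_oe_values = [1.5, 0.6, 1.1, 1.2]

specificity = [caste_specificity(q, w) for q, w in zip(queen, worker)]
rho = spearman(specificity, cpg_oe_values)
print("rs =", round(rho, 4))
```

Taking the absolute value means that queen-biased and worker-biased genes are treated alike, so the correlation asks only whether stronger bias in either direction goes with higher CpGO/E.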
The distribution of CpGO/E in A. mellifera genes stands in striking
contrast to that in D. melanogaster, T. castaneum, and A. gambiae
(Fig. 1). In particular, approximately half of A. mellifera genes
belong to a distinctive low-CpG class (Fig. 1D). Given that (i)
only CpG content exhibits bimodal distribution (Fig. S1), and (iii)
deamination of methylated CpGs to TpG (or CpA in the comple-
mentary strand) causes a GC-to-AT mutational bias in diverse taxa
(21, 22), these observations implicate DNA methylation in the
origin of CpG bimodality. As far as we are aware, no other
molecular mechanism is known to influence CpG dinucleotides
exclusively and is unique to the A. mellifera genome compared with
other sequenced insect genomes.
Table 1. Distinctive functional enrichment of low-CpG and high-CpG genes
CpG class | GO biological process term | Accession | Fold enrichment in class | Significance*
Macromolecule metabolic process
Cellular metabolic process
Primary metabolic process
Nucleobase, nucleoside, nucleotide, and nucleic acid metabolic process
Biopolymer metabolic process
RNA metabolic process
Multicellular organismal process
Multicellular organismal development
Anatomic structure development
The top 10 significantly enriched terms for low-CpG and high-CpG classes are shown; for a complete list, see Table S1. GO biological process term enrichment
is based on 1,781 D. melanogaster orthologs of A. mellifera high-CpG genes (1,230 with GO annotation) and 2,531 D. melanogaster orthologs of A. mellifera
low-CpG genes (1,713 with GO annotation).
*Significance is denoted by a Benjamini correction for multiple testing.
DNA methylation thus appears to be a factor in A.
mellifera evolution that may help explain important genome
characteristics. For instance, the A. mellifera genome is known for its
overall low and heterogeneous distribution of GC content (24, 25).
An earlier study also detected the presence of a mutational bias
toward AT in A. mellifera
genes (25); however, the nature of such a mutational process
remains unknown. Here we show that CpGO/E exhibits a striking
bimodality among A. mellifera
genes. These observations point to a link between the mutational
bias toward AT and the depletion of CpG dinucleotides resulting
from DNA methylation.
We also propose that, in addition to the mutational bias decreas-
ing CpG content in low-CpG genes, other molecular mechanisms
are operating to increase or maintain CpG content in high-CpG
genes. The CpGO/E of high-CpG genes is higher than that of other
dinucleotides and exceeds the value of 1.0 expected under random
association of C and G nucleotides (Fig. S1). Thus, a process that
conserves or even increases CpG dinucleotides against mutational
depletion may exist in the honeybee genome, especially in high-
CpG genes. The presence and nature of such processes in the A.
mellifera genome should be addressed in future studies.
We have demonstrated that a substantial number of A. mellifera
genes harbor evolutionary signatures of DNA methylation. This
leads to the question of the functional significance of DNA meth-
ylation in A. mellifera. One potential role of DNA methylation is
genomic imprinting, an epigenetic mechanism through which the
expression of a gene is influenced by the parent from which it is
inherited. In mammalian systems, DNA methylation is implicated
in genomic imprinting (1, 39, 40). Social insects, especially those
belonging to the haplodiploid Hymenoptera (social bees, social
wasps, and ants), provide another intriguing context in which
imprinting may play an important role in mediating a wide array of
behaviors (41–43). We predict that imprinted genes, which should
bear epigenetic marks (i.e., methylation) in the germlines, prefer-
entially belong to the hypermethylated low-CpG class. Because our
results demonstrate that nearly half of A. mellifera genes belong to
the low-CpG class, many genes are candidates for studies of
imprinting in A. mellifera. In this respect, it is of great interest to
note that DNA methylation is widespread in haplodiploid hyme-
nopteran social insects (44). Thus, information on CpG depletion
for specific sets of genes in social insects provides fertile ground for
future imprinting studies in a comparative context.
Our analyses indicate that methylation targets primarily gene
bodies (exons and introns) in the A. mellifera genome. Moreover,
methylated and nonmethylated regions coexist. Such a pattern is
qualitatively similar to that found in echinoderms (e.g., sea urchin)
and urochordates (e.g., sea squirt) (2, 19, 45). In the sea squirt (C.
intestinalis), where genomic methylation has been examined in
detail, it has been proposed that the primary role of DNA meth-
ylation is to suppress spurious transcription of genes that are
(2, 19). Our observation that genes that tend to be methylated are
involved in basic biological processes (Table 1) supports this idea.
We found that low-CpG and high-CpG classes are populated
with genes belonging to distinctive functional categories (Table 1).
Low-CpG genes often are involved in metabolic processes and
nucleotide processing, which can be considered basic biological
processes. In contrast, a high proportion of high-CpG genes, which
Table 2. Caste-specific genes, which are differentially expressed between the queen and worker castes, are significantly
overrepresented in the high-CpG class compared with caste-generic genes, whose expression patterns are not significantly different
between the 2 castes
Gene expression class | Significance threshold* | High-CpG class | Low-CpG class | CpGO/E, mean ± SEM (median) | Wilcoxon P value‡
1.0895 ± 0.0135 (1.0577)
1.1633 ± 0.0149 (1.2094)
1.1837 ± 0.0187 (1.2663)
1.2439 ± 0.0260 (1.3637)
1.3274 ± 0.0352 (1.4042)
P < .05
P < .01
P < .001
P < .0001
The significance of the tests increases (i.e., P values decrease) as the significance threshold for genes considered caste-specific becomes more stringent.
*Significance threshold for caste-specific genes differentially expressed by queens and sterile workers in a pairwise comparison.
class, after Yates's correction.
‡P values of Wilcoxon's rank-sum test with continuity correction from pairwise comparisons of CpGO/E values for caste-specific genes versus caste-generic genes.
Fig. 2. Caste-specific genes tend to [...] [measured as the absolute
value of log2(queen/worker) gene expression] is correlated with
CpGO/E (Spearman's rank correlation, rs = 0.1405; P = 2.80e-09).
Mean values of CpGO/E for [...] shown as black dots with 95%
confidence interval error bars. Ten outliers beyond caste-specificity
values of 1.2 are excluded from the figure, but are [...] and model
fitting. Points in the scatterplot are divided into caste-generic [...]
significant differences in expression [...] The relationship between
the values of log2-gene expression ratios be- [...] that the enrichment
of high-CpG genes holds for genes that are either queen-specific or
worker-specific. Genes expressed more highly in workers have
log2-ratios < 0, whereas those expressed more highly in queens have
log2-ratios > 0. The y-axis shows the mean and 95% confidence
intervals of each group. As the log2-expression ratios between castes
become more extreme (either side of the x-axis), CpGO/E tends to
become more elevated.
are predicted to be hypomethylated, are involved in development.
This finding is particularly intriguing when considered along with
the results of recent studies implicating DNA methylation in the
regulation of phenotypic plasticity in social insects (12, 14).
Interestingly, we found that genes that are overexpressed in a
specific caste are found more frequently in the hypomethylated
class (Tables 2 and 3; Fig. 2). But it is noteworthy that not all
caste-specific genes are found in the high-CpG class (Table 3); for
example, genes associated with metabolism are frequently differ-
entially expressed between castes (46–48) but overrepresented in
the low-CpG class. Thus, the enrichment of caste-specific genes in
the high-CpG class is particularly striking.
Previous studies in A. mellifera also have uncovered associa-
tions among cis-regulatory motifs, social behavior, and caste
development (46, 49), indicating that cis-regulatory elements
represent a putative global control mechanism for caste-specific
gene expression. The significance of cis-regulatory elements,
coupled with the finding that methylation can regulate caste fate
(12), gives rise to the possibility that methylation interacts with
regulatory elements to differentiate developmental pathways.
But methylation of cis-regulatory elements themselves may not
be a major mechanism underlying caste differences in A. mel-
lifera, because our results suggest that methylation is limited
primarily to gene bodies (Fig. S2).
Why are caste-specific genes preferentially found in the high-
CpG class? We hypothesize that high-CpG genes in A. mellifera
generally are more prone to epigenetic modulation than low-CpG
genes. Large-scale analyses of methylation patterns in mammals
repeatedly show that a subset of high-CpG promoters, particularly
those associated with developmental processes, exhibit significant
epigenetic flexibility, meaning that they are methylated in some
tissues or developmental stages but not in others (50, 51). Further-
more, a class of mammalian genes with high-CpG promoters
achieves complex, tissue-specific gene expression via pliable tran-
scriptional regulation (N.E. and S.V.Y., unpublished data). Our
observation that caste-specific genes tend to be enriched in the
high-CpG class agrees with the aforementioned findings in mam-
mals and may share similar underlying molecular mechanisms.
Caste-specific genes must be activated or inactivated based on
environmental input to proceed along different developmental
paths; the high-CpG content of caste-specific genes may facilitate
such modulation, similar to the role played by some high-CpG
promoters in mammalian genomes.
(4). We have found that the genome of A. mellifera can be divided
into 2 distinct classes based on the level of CpG depletion. Several
pieces of evidence suggest that DNA methylation is the causative
mechanism behind the observed bimodality. In particular, our
prediction correctly assigns all genes identified as methylated in a
previous study (6) to the low-CpG class. Our results suggest that
DNA methylation regulates development, as seems to be the case
in numerous other taxa (1, 39, 52, 53). In fact, DNA methylation is
believed to play a critical role in caste differentiation (12). Our
analyses of caste-specific genes provide support for this idea, but
future studies and experimental verification of caste-specific gene
expression and DNA methylation are needed.
The social Hymenoptera are ideal for studying the evolution and
development of phenotypic plasticity, because the order comprises
diverse taxa with multiple independent evolutionary origins of
specialized queen and worker castes (54). The study of A. mellifera
provides an important first look into the genome of a social
hymenopteran insect (24), but the genomes of many social insects
and related species are likely to be sequenced within the next 10
years (55). Comparative genomic analyses of evolutionary methyl-
ation signatures and experimental verification will more fully
elucidate the evolutionary history and functional roles of DNA
methylation in this important group.
Materials and Methods
Genome Sequences and Annotations. Genome sequences and gene annotations
of A. mellifera, A. gambiae, and D. melanogaster were downloaded from the
University of California Santa Cruz genome browser (genome builds apimel2,
anoGam1, and dm3). The genome sequence and gene annotation of T. casta-
neum was downloaded from BeetleBase (www.beetlebase.org). Repetitive ele-
ments were annotated using the RepeatMasker program.
Measurement of CpGO/E and Tests for Bimodality. CpGO/E is a metric of depletion
of CpG dinucleotides, normalized by the C and G content of
the specific region of interest. The CpGO/E for each gene is defined as
$$\mathrm{CpG}_{O/E} = \frac{P_{CpG}}{P_{C} \times P_{G}}$$
where P_CpG, P_C, and P_G are the frequencies of CpG dinucleotides, C nucleotides,
and G nucleotides, respectively, estimated from each gene (Dataset S1). Here a
gene was defined as all exons (both coding sequences and untranslated exons) [...]
The unimodality or bimodality of CpGO/E distributions was tested using the
NOCOM software package. In brief, this software uses an expectation maximization
algorithm to fit the data to both unimodal and bimodal distribution
models and finds the maximum likelihood values (L0 for unimodal models
and L1 for bimodal models). The statistic G^2 = 2 [ln(L1) − ln(L0)], which
approximately follows a χ^2 distribution with 2 degrees of freedom, can be used to test
whether the bimodal model fits the data significantly better than the unimodal
distribution. The cutoff value between high-CpG genes and low-CpG genes was
determined by plotting curves based on the NOCOM means of 0.55 (SD = 0.20)
and 1.50 (SD = 0.20) and determining their point of intersection (1.08; Fig. 1D).
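The likelihood-ratio comparison of unimodal and bimodal fits can be sketched in plain Python. This is not the NOCOM package: it is a minimal expectation-maximization fit written for illustration, and it assumes a two-component normal mixture with a common SD (so that the bimodal model has 2 more parameters than the unimodal one, matching the 2 degrees of freedom in the text). The function names, starting values, iteration count, and simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def loglik_unimodal(data):
    """Maximized log-likelihood ln(L0) of a single normal distribution."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(math.log(normal_pdf(x, mean, sd)) for x in data)


def fit_bimodal(data, n_iter=200):
    """EM fit of a 2-component normal mixture with a shared SD; returns ln(L1)."""
    n = len(data)
    ordered = sorted(data)
    half = n // 2
    # Crude starting values: means of the lower and upper halves of the data.
    m1 = sum(ordered[:half]) / half
    m2 = sum(ordered[half:]) / (n - half)
    grand = sum(data) / n
    sd = math.sqrt(sum((x - grand) ** 2 for x in data) / n) / 2.0
    w = 0.5
    for _ in range(n_iter):
        # E-step: probability that each point belongs to component 1.
        resp = []
        for x in data:
            a = w * normal_pdf(x, m1, sd)
            b = (1.0 - w) * normal_pdf(x, m2, sd)
            resp.append(a / (a + b) if a + b > 0 else 0.5)
        # M-step: update the mixing weight, both means, and the shared SD.
        n1 = max(sum(resp), 1e-12)
        n2 = max(n - n1, 1e-12)
        m1 = sum(r * x for r, x in zip(resp, data)) / n1
        m2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        var = sum(r * (x - m1) ** 2 + (1.0 - r) * (x - m2) ** 2
                  for r, x in zip(resp, data)) / n
        sd = math.sqrt(var)
        w = n1 / n
    loglik = sum(math.log(w * normal_pdf(x, m1, sd)
                          + (1.0 - w) * normal_pdf(x, m2, sd)) for x in data)
    return loglik, (w, m1, m2, sd)


def likelihood_ratio_test(data):
    """G^2 = 2 [ln(L1) - ln(L0)], compared with chi-square on 2 df."""
    l0 = loglik_unimodal(data)
    l1, params = fit_bimodal(data)
    g2 = 2.0 * (l1 - l0)
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    p_value = math.exp(-g2 / 2.0) if g2 > 0 else 1.0
    return g2, p_value, params


# Simulated data only: two equal-sized groups using the means and SDs
# reported in the text, not the actual gene-level values.
rng = random.Random(0)
simulated = ([rng.gauss(0.55, 0.20) for _ in range(500)]
             + [rng.gauss(1.50, 0.20) for _ in range(500)])
g2, p_value, (w, m1, m2, sd) = likelihood_ratio_test(simulated)
print("G^2 =", round(g2, 1), "means =", round(m1, 2), round(m2, 2))
```

The shared-SD choice keeps the sketch aligned with the stated degrees of freedom; a mixture with separate SDs would add a third extra parameter and call for a different reference distribution.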
Table 3. Genes identified as caste-specific from previous studies of gene expression and caste development in A. mellifera tend
to belong (23 of 28; P < .005) to the hypomethylated class (high-CpG)
Gene/gene family | Function | Caste-biased expression | CpGO/E class | Reference
AmIF-2mt translation initiation factor
AmILP-2 insulin-like peptide
Translation of mitochondrial-encoded mRNAs
Regulation of growth/metabolism
Higher in queen larvae
Higher in workers than queens from second
Higher in worker adults
AmInR putative insulin-like peptide
amTOR (target of rapamycin)
Regulation of growth/metabolism | 2/2 high-CpG | (34)
Regulation of growth/metabolism | Higher in queen 3rd instar larvae, but not 5th
instar larvae (RNAi linked to worker fate)
Either more highly expressed in queen or
worker larvae (based on 2 empirically
Higher in queen adults
Primarily more highly expressed in workers,
but some more highly expressed in queens
(diverse tissue-dependent expression
0/1 high-CpG | (37)
Hexamerin family | Storage of amino acids for use in
metamorphosis or by adults
Yellow/major royal jelly protein family
Sex-specific reproductive maturity among
GO Biological Process Term Enrichment. Because GO annotation (31) is limited in
A. mellifera, D. melanogaster orthologs of A. mellifera genes were used for this
analysis. To identify orthologous proteins between A. mellifera and D.
melanogaster, Refseq RNA nucleotide accessions for A. mellifera sequences were con-
Center for Biotechnology Information (NCBI) ftp site (http://www.ncbi.nlm.nih.
distance algorithm. A divergence threshold of 0.8 and a BLAST E-value cutoff of
1e-10 were used for ortholog identification. A total of 4,312 orthologous gene
pairs between A. mellifera and D. melanogaster were obtained for further
GO biological process term enrichment was determined by comparing or-
of both low-CpG and high-CpG orthologs using the DAVID bioinformatics data-
base functional annotation tool (57). A Benjamini multiple-testing correction of
significance of gene enrichment (58).
Differential Gene Expression Between Honeybee Queen and Worker Castes.
Differential gene expression in brains of A. mellifera adult queens and sterile
A list of BAGEL normalized expression levels (59) and P values for expression
differences between queens and sterile workers was obtained from C.M. Groz-
inger. Gene identifiers for microarray data were converted to RNA nucleotide
(http://www.ncbi.nlm.nih.gov/ftp/; Dataset S2).
data and C.M. Grozinger for readily corresponding and providing data from
previous research. This study was supported by an Alfred P. Sloan Research
Fellowship (to S.V.Y.) and National Science Foundation Grant DEB 0640690 (to
M.A.D.G. and S.V.Y.).
Biochem Sci 31:89–97.
2. Suzuki MM, Bird A (2008) DNA methylation landscapes: Provocative insights from
epigenomics. Nat Rev Genet 9:465–476.
3. Hendrich B, Tweedie S (2003) The methyl-CpG binding domain and the evolving role
of DNA methylation in animals. Trends Genet 19:269–277.
4. Field LM, Lyko F, Mandrioli M, Prantera G (2004) DNA methylation in insects. Insect Mol
5. Schaefer M, Lyko F (2007) DNA methylation with a sting: An active DNA methylation
system in the honeybee. BioEssays 29:208–211.
6. Wang Y, et al. (2006) Functional CpG methylation system in a social insect. Science
314:645–647.
7. Strassmann JE, Queller DC (2007) Insect societies as divided organisms: The complexities
of purpose and cross-purpose. Proc Natl Acad Sci USA
8. Wilson EO (1971) The Insect Societies (Harvard Univ Press, Cambridge, MA).
10. Keller L (1999) Levels of Selection in Evolution (Princeton Univ Press, Princeton, NJ).
11. Maynard Smith J, Szathmary E (1998) The Major Transitions in Evolution (Oxford Univ
12. Kucharski R, Maleszka J, Foret S, Maleszka R (2008) Nutritional control of reproductive
status in honeybees via DNA methylation. Science 319:1827–1830.
13. Maleszka R (2008) Epigenetic integration of environmental and genomic signals in
honey bees. Epigenetics 3:188–192.
14. Moczek AP, Snell-Rood EC (2008) The basis of bee-ing different: The role of gene
silencing in plasticity. Evol Dev 10:511–513.
15. Goodisman MAD, Kovacs JL, Hunt BG (2008) Functional genetics and genomics in ants
(Hymenoptera: Formicidae): The interplay of genes and social life. Myrmecol News
16. Robinson GE, Grozinger CM, Whitfield CW (2005) Sociogenomics: Social life in molec-
ular terms. Nat Rev Genet 6:257–271.
origins exhibit contrasting patterns of regional substitution rate variation. PLoS Com-
put Biol 4:e1000015.
18. Saxonov S, Berg P, Brutlag DL (2006) A genome-wide analysis of CpG dinucleotides in
the human genome distinguishes two distinct classes of promoters. Proc Natl Acad Sci
19. Suzuki MM, Kerr ARW, De Sousa D, Bird A (2007) CpG methylation is targeted to
transcription units in an invertebrate genome. Genome Res 17:625–631.
20. Weber M, et al. (2007) Distribution, silencing potential and evolutionary impact of
promoter DNA methylation in the human genome. Nat Genet 39:457–466.
22. Elango N, Yi S (2008) DNA methylation and structural and functional bimodality of
vertebrate promoters. Mol Biol Evol 25:1602–1608.
23. Urieli-Shoval S, Gruenbaum Y, Sedat J, Razin A (1982) The absence of detectable
methylated bases in Drosophila melanogaster DNA. FEBS Lett 146:148–152.
the genome of the honeybee, Apis mellifera. Nature 443:931–949.
differential usage of codons and amino acids in GC-poor and GC-rich regions of the
genome of Apis mellifera. Mol Biol Evol 24:611–619.
26. Duret L, Galtier N (2000) The covariation between TpA deficiency, CpG deficiency, and
G+C content of human isochores is due to a mathematical artifact. Mol Biol Evol
27. Fryxell KJ, Zuckerkandl E (2000) Cytosine deamination plays a primary role in the
evolution of mammalian isochores. Mol Biol Evol 17:1371–1383.
28. Razin A, Riggs AD (1980) DNA methylation and gene function. Science 210:604–610.
29. Fryxell KJ, Moon W-J (2005) CpG mutation rates in the human genome are highly
dependent on local GC content. Mol Biol Evol 22:650–658.
30. Simmen MW, et al. (1999) Nonmethylated transposable elements and methylated
genes in a chordate genome. Science 283:1164–1167.
31. Ashburner M, et al. (2000) Gene Ontology: Tool for the unification of biology. Nat
32. Grozinger CM, Fan YL, Hoover SER, Winston ML (2007) Genome-wide analysis reveals
differences in brain gene expression patterns associated with caste and reproductive
status in honey bees (Apis mellifera). Mol Ecol 16:4837–4848.
33. Corona M, Estrada E, Zurita M (1999) Differential expression of mitochondrial genes
between queens and workers during caste determination in the honey bee, Apis
mellifera. J Exp Biol 202:929–938.
34. Corona M, et al. (2007) Vitellogenin, juvenile hormone, insulin signaling, and queen
honey bee longevity. Proc Natl Acad Sci USA 104:7128–7133.
35. Drapeau MD, Albert S, Kucharski R, Prusko C, Maleszka R (2006) Evolution of the
yellow/major royal jelly protein family and the emergence of social behavior in honey
bees. Genome Res 16:1385–1394.
and workers in the honey bee, Apis mellifera. Proc Natl Acad Sci USA 96:5575–5580.
37. Patel A, et al. (2007) The making of a queen: TOR pathway is a key player in diphenic
caste development. PLoS One 2:e509.
38. Wheeler DE, Buck N, Evans JD (2006) Expression of insulin pathway genes during the
period of caste determination in the honey bee, Apis mellifera. Insect Mol Biol
39. Jones PA, Takai D (2001) The role of DNA methylation in mammalian epigenetics.
40. Li E, Beard C, Jaenisch R (1993) Role for DNA methylation in genomic imprinting.
41. Haig D (1992) Intragenomic conflict and the evolution of eusociality. J Theor Biol
42. Haig D (2000) The kinship theory of genomic imprinting. Annu Rev Ecol Syst 31:9–32.
43. Queller D (2003) Theory of genomic imprinting conflict in social insects. BMC Evol Biol
44. Kronforst MR, Gilley DC, Strassmann JE, Queller DC (2008) DNA methylation is wide-
spread across social hymenoptera. Curr Biol 18:R287–R288.
45. Tweedie S, Charlton J, Clark V, Bird A (1997) Methylation of genomes and genes at the
invertebrate-vertebrate boundary. Mol Cell Biol 17:1469–1475.
46. Cristino ADS, et al. (2006) Caste development and reproduction: A genome-wide
analysis of hallmarks of insect eusociality. Insect Mol Biol 15:703–714.
Genome Biol 2:1.
48. Wolschin F, Amdam G (2007) Comparative proteomics reveal characteristics of life-
history transitions in a social insect. Proteome Sci 5:10.
49. Sinha S, Ling X, Whitfield CW, Zhai C, Robinson GE (2006) Genome scan for cis-
regulatory DNA motifs associated with social behavior in honey bees. Proc Natl Acad
Sci USA 103:16352–16357.
50. Illingworth R, et al. (2008) A novel CpG island set identifies tissue-specific methylation
at developmental gene loci. PLoS Biol 6:e22.
51. Meissner A, et al. (2008) Genome-scale DNA methylation maps of pluripotent and
differentiated cells. Nature 454:766–771.
52. Bird A (2002) DNA methylation patterns and epigenetic memory. Genes Dev 16:6–21.
53. Li E (2002) Chromatin modification and epigenetic reprogramming in mammalian
development. Nat Rev Genet 3:662–673.
54. Hughes WOH, Oldroyd BP, Beekman M, Ratnieks FLW (2008) Ancestral monogamy
shows kin selection is key to the evolution of eusociality. Science 320:1213–1216.
division of labour in insect societies. Nat Rev Genet 9:735–748.
56. DeLuca TF, et al. (2006) Roundup: A multi-genome repository of orthologs and evo-
lutionary distances. Bioinformatics 22:2044–2046.
57. Dennis G, et al. (2003) DAVID: Database for annotation, visualization, and integrated
discovery. Genome Biol 4:R60.
58. Hosack D, Dennis G, Sherman B, Lane H, Lempicki R (2003) Identifying biological
themes within lists of genes with EASE. Genome Biol 4:R70.
59. Townsend JP, Hartl DL (2002) Bayesian analysis of gene expression levels: Statistical
quantification of relative mRNA level across multiple strains or treatments. Genome
Elango et al. | PNAS | July 7, 2009 | vol. 106 | no. 27