From the Genome to the Proteome: Uncovering Peptides in the Apis Brain
Amanda B. Hummon et al.
Science 314, 647 (2006)
Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright
2006 by the American Association for the Advancement of Science; all rights reserved. The title Science is a
registered trademark of AAAS.
the honey bee than in vertebrates, arguing against
DNA methylation as a global mediator of bee
heterochromatin. Third, honey bees possess two
paralogs for methylation maintenance, making
somatic DNMT1 proteins. Fourth, all detected
methylation was limited predominantly to the
coding regions of genes. It remains to be seen
how any of these differences relate to the functions
of methylation in social contexts.
Supporting Online Material
Materials and Methods
Figs. S1 to S8
15 September 2006; accepted 3 October 2006
From the Genome to the Proteome:
Uncovering Peptides in the Apis Brain
Amanda B. Hummon,1 Timothy A. Richmond,1 Peter Verleyen,2 Geert Baggerman,2
Jurgen Huybrechts,2 Michael A. Ewing,1 Evy Vierstraete,2 Sandra L. Rodriguez-Zas,3,4,5*
Liliane Schoofs,2* Gene E. Robinson,4,5,6,7* Jonathan V. Sweedler1,4,5,7*
Neuropeptides, critical brain peptides that modulate animal behavior by affecting the activity of
almost every neuronal circuit, are inherently difficult to predict directly from a nascent genome
sequence because of extensive posttranslational processing. The combination of bioinformatics and
proteomics allows unprecedented neuropeptide discovery from an unannotated genome. Within the
Apis mellifera genome, we have inferred more than 200 neuropeptides and have confirmed the
sequences of 100 peptides. This study lays the groundwork for future molecular studies of Apis
neuropeptides with the identification of 36 genes, 33 of which were previously unreported.
[...] as well as other current and planned sequencing
projects, will not be accompanied by the extensive
biochemical characterization that complemented the
annotation processes for the Drosophila melanogaster and
Caenorhabditis elegans genomes. This lack of biochemical
information is particularly problematic when annotating
neuropeptide genes, because neuropeptide protein
precursors undergo extensive posttranslational processing
before producing final neuropeptides. This makes the
determination of the mature bioactive [...]lished
methodology that permits rapid identification of the
final neuropeptides from a nascent [...]

Most efforts to elucidate neuropeptides from newly
sequenced genomes have relied on homology searches to
determine prohormone precursors, with follow-up
biochemical studies to confirm the putative peptides.
In D. melanogaster and C. [...] the neuropeptides of
these two organisms had been performed before sequencing,
providing enormous advantages when annotation of their
genomes began (2). In 2002, through homology searches
against the then–newly sequenced Anopheles gambiae
genome, multiple neuropeptide precursor [...] mosquito
peptides were inferred (3). However, confirmation of
these predictions and the discovery of the neuropeptides
themselves have been left to be performed. In the
D. melanogaster genome sequence, there is evidence for
at least 31 neuro[...] predicted from the A. gambiae
genome sequence (3). Prior studies of Apis neuropeptides
reported [...] with mass spectrometry (MS) (7, 10); as
shown in Table 1, the 36 neuropeptide-related genes
1Department of Chemistry,
2Laboratory of Developmental Physiology, Genomics and Proteomics, Katholieke Universiteit Leuven,
Naamsestraat 59, Louvain B-3000, Belgium.
3Department of Animal Sciences,
4Institute for Genomic Biology,
6Department of Entomology, University of Illinois, Urbana, IL
Institute, University of Illinois, Urbana, IL 61801, USA.
*To whom correspondence should be addressed: jsweedle@
Fig. 1. Mass spectrometric sequence coverage for the tachykinin, allatostatin, and neuropeptide-like
protein 1 prohormones. Sequences underlined indicate peptides that have been detected (and frequently
sequenced) by mass spectrometry. The signal sequences for each prohormone are italicized.
VOL 314 27 OCTOBER 2006
reported here are similar in number to those for
the best-annotated animal genomes.
We combined homology and codon-scanning
searches, together with techniques to confirm
gene expression, in an iterative feedback loop to
infer and to verify neuropeptide genes and their
final peptide products (11). With this approach,
a novel gene is annotated, the bioactive peptides
predicted by a statistical algorithm developed in
our laboratory (12), and the peptides confirmed
through MS analysis of Apis brain samples. Al-
ternatively, the characterization of a novel pep-
tide by MS-driven de novo sequencing leads to
targeted searches of the genome, followed by
annotation of a new gene, which then is ex-
amined to predict additional novel peptides.
We discovered 13 unknown putative neuro-
peptides through de novo sequencing (table S1),
which led to the identification of 11 genes that
often encoded additional peptides. The original
peptide, as well as the additional peptides dis-
covered through identification of the precur-
sor, frequently exhibited structural hallmarks
of bioactivity—such as C-terminal amidation.
In one instance, two intense signals correspond-
ing to the closely eluting peptides, VPIYQEPRF
and NVPIYQEPRF, were determined through
MS sequencing of brain samples (13). When
NVPIYQEPRF was used to probe the Apis
genome, a gene was discovered that encodes
this peptide. The peptide in the precursor
was flanked by commonly used proteolytic
cleavage sites. Additional peptides were pre-
dicted from the sequence by our statistically
based prohormone-processing algorithm (12),
and two of them (GYPYQHRLVY and
were subsequently identified with tandem MS
(MS/MS) sequencing information, which con-
firmed their identities.
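To illustrate the prediction step in simplified form, the following Python sketch cleaves a precursor at runs of basic residues and treats a C-terminal glycine as an amidation signal. It is a rule-based stand-in, not the statistical prohormone-processing model of (12). The function name predict_peptides, the minimum-length cutoff, and the toy precursor (an artificial concatenation of peptides named in the text, not a real Apis sequence) are assumptions made for illustration.

```python
import re

# Assumption: cleavage occurs at any run of two or more Lys/Arg residues.
# Real prohormone processing is more selective than this rule.
DIBASIC = re.compile(r"[KR]{2,}")


def predict_peptides(precursor, min_len=4):
    """Split a precursor at dibasic sites and flag likely amidation.

    A fragment ending in Gly is reported with the Gly replaced by
    the suffix 'amide', following the usual amidation-donor rule.
    """
    peptides = []
    for fragment in DIBASIC.split(precursor):
        if len(fragment) < min_len:
            continue  # skip empty or very short fragments
        if fragment.endswith("G"):
            fragment = fragment[:-1] + "amide"
        peptides.append(fragment)
    return peptides


# Hypothetical toy precursor: leader, then three peptides separated by KR/RR.
toy = "MSLLAVKRNVPIYQEPRFKRGYPYQHRLVYRRTWKSPDIVIRFGKR"
print(predict_peptides(toy))
```

A fixed rule of this kind will miss monobasic sites and will cut at dibasic sites that are not processed in vivo, so its candidates still need confirmation by MS.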
The transcript of this previously unanno-
tated gene had been analyzed in microarray
studies and was found to be highly correlated
with behavioral plasticity in the honey bee
(14). No homologs for the protein precursor
were found in any of the sequenced eukary-
otic genomes (11), although a section of the
peptide had similarity to a portion of the
β-tubulin protein from the fungi Amanita sinensis
for A. sinensis and SSKSRGYPYQHR for A.
0.047 and 0.063].
Three precursors, discovered through de novo
sequencing of peptides in Apis brain samples, and
their corresponding peptides, LRNQLDIGDLQ,
TWKSPDIVIRFamide, have no similarity to
other known protein sequences or to any trans-
lated genome (11). In the case of the precursor
encoding TWKSPDIVIRFamide, three additional
predicted peptides have been identified, including
one, GRNDLNFIRYamide, with a C-terminal
Another peptide de novo sequenced from
Apis brain samples is ITGQGNRIF. Analysis of
the Apis genome revealed a precursor encoding
the peptide near the C terminus of the protein,
situated between proteolytic cleavage sites. Pro-
teins displaying similarities to this Apis peptide
were present in A. gambiae and D. melanogaster
[A. gambiae sequence: ENSANGP00000017235
(Ensembl gene database), E-value, 2E-50; D.
melanogaster: CG8216-PA, E-value, 5E-16].
There is no predicted function for the A. gambiae
has a suggested role in DNA binding, as it is
putatively involved in DNA transposition (FlyBase
report CG8216). Most interesting, though, is
that these putative proteins do not contain the
ITGQGNRIF peptide detected in honey bee
De novo MS sequencing with the support of
the Apis genome also led to the discovery of mul-
tiple neuropeptides similar to other known insect
neuropeptides (Table 1 and table S1). In total, 100
MS (tables S2 and S4), 87 of which had MS/MS
sequence-confirming data. For most of these
precursors, multiple peptides were characterized.
In the cases of allatostatin, neuropeptide-like
sequenced or confirmed by mass matches, which
resulted in high sequence coverage for the
peptides with other functions (11).
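A mass match of the kind mentioned above compares an observed ion with the theoretical mass of a predicted peptide. The Python sketch below shows the arithmetic using standard monoisotopic residue masses. The function names, the 'amide' suffix convention, and the 50 ppm tolerance are assumptions for illustration and do not describe the instrument settings or software used in this study.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056       # H2O added for the free N and C termini
PROTON = 1.00728       # mass of the charge carrier in [M+H]+
AMIDATION = -0.98402   # C-terminal OH replaced by NH2


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass; a trailing 'amide' marks amidation."""
    amidated = peptide.endswith("amide")
    sequence = peptide[:-5] if amidated else peptide
    mass = sum(MONO[residue] for residue in sequence) + WATER
    return mass + (AMIDATION if amidated else 0.0)


def mass_match(observed_mz, peptide, tol_ppm=50.0):
    """True if a singly charged ion matches the peptide within tol_ppm."""
    theoretical = monoisotopic_mass(peptide) + PROTON
    return abs(observed_mz - theoretical) / theoretical * 1e6 <= tol_ppm


for name in ("VPIYQEPRF", "NVPIYQEPRF", "TWKSPDIVIRFamide"):
    print(name, round(monoisotopic_mass(name) + PROTON, 4))
```

Because leucine and isoleucine have identical masses, a mass match alone cannot distinguish them, which is one reason MS/MS sequence data and the genome sequence are used together.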
However, because of variations that may be
caused by organismal life cycles, analytical tech-
niques, and/or sample sizes, biochemical-based
some categories of neuropeptides. For example,
eclosion hormones are expressed during short
temporal windows and may not be detectable by
Table 1. Apis mellifera neuropeptide genes.
Adipokinetic hormone (AKH)
Apidaecin 1 and 2
Calcitonin-like diuretic hormone (DH31)
Crustacean cardioactive peptide (CCAP)
Crustacean hyperglycemic hormone (ITP)
Diuretic hormone (DH)
Ecdysis-triggering hormone (ETH)
Eclosion hormone (EH)
Neuropeptide F (NPF)
Neuropeptide FF (NPFF)–like
Neuropeptide-like protein 1 (NPLP-1)
Neuropeptide-like protein 2 (NPLP-2)
Neuropeptide-like protein 3 (NPLP-3)
Pigment-dispersing hormone (PDH)
Short neuropeptide F (sNPF)
MS unless the timing of sample acquisition is
matched to those windows. In cases where it was
unlikely that the peptides would be detectable by
MS, we characterized the proteins by bioinfor-
matics approaches using the genomic data. Some
of the well-known insect neuropeptide precursors
that are predicted from the genome using
homology-based searches include crustacean car-
dioactive peptide, crustacean hyperglycemic hor-
mone, eclosion hormone, and insulin.
Because of the repetitive quality of several
neuropeptide precursors, we developed a codon-
scanning algorithm that searches for repetitive
sequences in the genome. When multiple pep-
tides are generated from a single precursor, they
frequently share a repeating C-terminal amino
acid pattern (for example, the FGLamide motif
present in allatostatin peptides). Algorithms
tailored to recognize repeating motifs can more
successfully identify neuropeptide genes than
traditional homology-based approaches. In an
analysis of the C. elegans genome for RFamide-
coding genes, two putative genes were identi-
fied by homology searches, a nominal number
when compared with the 29 potential genes
discovered with a tailored, pattern-matching
algorithm (15). We used a similar approach to
identify potential precursors without homologs in
other species. As shown in Table 1, this method
was also used to verify the positions in the ge-
nome of several of the repeating precursors, for
example, allatostatin, tachykinin, and pheromone
As mentioned above, one neuropeptide
gene family containing repeating C termini, the
RFamides, is a well-studied family of myotropic
ranging from mollusks to mammals (16–20). In
mammals, two of the RFamides, neuropeptides
FF (NPFF) and AF (NPAF), are produced from
NPFF precursors and end in a C-terminal motif,
QRFamide (21). We probed the Apis genome
using the codon-scanning algorithm looking for
an open reading frame that encodes a motif with
at least two RFamides located within 1 kilobase
of each other. We discovered a putative gene
that encoded a signaling protein containing
three peptides terminating in QRFamide. In this
particular case, the codon-scanning algorithm
sion of this putative Apis NPFF-like gene was
verified by quantitative reverse transcription–
polymerase chain reaction (qRT-PCR) (table
neuropeptide genes—FLRFamide, RFamide1,
and RFamide2. Because these precursors do not
display significant similarity to RFamide pre-
cursors from other insects, they are not iden-
tifiable using homology searches.
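The search criterion described above, an open reading frame encoding at least two RFamide motifs within 1 kilobase of each other, can be sketched in Python as follows. This is a minimal illustration and not the codon-scanning algorithm used in this study. The motif RFG[KR] (Arg-Phe, a Gly amidation donor, then a basic cleavage residue), the function names, and the toy sequence are assumptions, and only the three forward reading frames are scanned.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumed precursor-level signature of an RFamide peptide.
MOTIF = re.compile(r"RFG[KR]")


def translate(dna, frame):
    """Translate one forward frame; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def scan_for_repeats(dna, min_hits=2, window_nt=1000):
    """Return (frame, motif start positions) for stop-free segments that
    contain at least min_hits motifs within window_nt nucleotides."""
    dna = dna.upper()
    candidates = []
    for frame in range(3):
        codon_offset = 0
        for segment in translate(dna, frame).split("*"):
            starts = [
                frame + 3 * (codon_offset + m.start())
                for m in MOTIF.finditer(segment)
            ]
            clustered = any(
                starts[i + min_hits - 1] - starts[i] <= window_nt
                for i in range(len(starts) - min_hits + 1)
            )
            if clustered:
                candidates.append((frame, starts))
            codon_offset += len(segment) + 1  # +1 accounts for the stop codon
    return candidates


# Hypothetical toy gene built from one fixed codon per residue.
codon_for = {"M": "ATG", "A": "GCT", "Q": "CAG", "R": "CGT",
             "F": "TTC", "G": "GGT", "K": "AAA"}
toy_dna = "".join(codon_for[a] for a in "MAAQRFGKAAAQRFGKAAQRFGR") + "TAA"
print(scan_for_repeats(toy_dna))
```

Scanning the reverse strand would require running the same function on the reverse complement, and a search for a different repeat (for example the FGLamide motif of the allatostatins) would only need a different motif pattern.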
Our combined approach yields many more
peptides than the individual approaches used
previously. As a result, in a single investigative
effort, a comparable number of neuropeptides are
now known in the honey bee relative to other
well-studied animal models. Microarrays can be
designed to include a greater number of neuro-
peptide gene products, thereby expanding our
understanding of the expression of the neuro-
modulators inherent to the operation of neuronal
networks. The potential of our blended technol-
ogy approach to facilitate discovery of these
peptides is not only significant for advancing
honey bee research, it demonstrates promise for
neuropeptide discovery in the large number of
other new genomes currently being sequenced.
References and Notes
1. Honey Bee Genome Sequencing Consortium, Nature
443, 931 (2006).
2. PubMed search for D. melanogaster neuropeptides
pre–March 2000, yielded 305 articles; for C. elegans
pre–December 1998, 75 articles.
3. M. A. Riehle, S. F. Garczynski, J. W. Crim, C. A. Hill,
M. R. Brown, Science 298, 172 (2002).
4. R. S. Hewes, P. H. Taghert, Genome Res. 11, 1126 (2001).
5. J. Vanden Broeck, Peptides 22, 241 (2001).
6. Multiple processing products of the genes predicted from
the D. melanogaster genome sequence were later
confirmed with MS, and sequences of interest were
7. H. Takeuchi, A. Yasuda, Y. Yasuda-Kamatani, T. Kubo,
T. Nakajima, Insect Mol. Biol. 12, 291 (2003).
8. P. Verleyen et al., Biochem. Biophys. Res. Commun. 320,
9. P. Verleyen et al., Peptides 27, 493 (2006).
10. N. Audsley, R. J. Weaver, Peptides 27, 512 (2006).
11. Materials and methods are available as supporting
material on Science Online.
12. A. B. Hummon et al., J. Proteome Res. 2, 650 (2003).
13. Single-letter abbreviations for the amino acid residues
are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe;
G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro;
Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
14. C. W. Whitfield, A.-M. Cziko, G. E. Robinson, Science 302,
15. A. N. Nathoo, R. A. Moeller, B. A. Westlund, A. C. Hart,
Proc. Natl. Acad. Sci. U.S.A. 98, 14000 (2001).
16. M. Schaefer et al., Cell 41, 457 (1985).
17. R. Nichols, J. B. McCormick, I. A. Lim, Ann. N. Y. Acad.
Sci. 897, 264 (1999).
18. S. J. Husson, E. Clynen, G. Baggerman, A. De Loof,
L. Schoofs, Biochem. Biophys. Res. Commun. 335, 76
19. G. J. Dockray, Exp. Physiol. 89, 229 (2004).
20. N. Chartrel et al., Proc. Natl. Acad. Sci. U.S.A. 100,
21. E. Bonnard et al., Peptides 22, 1085 (2001).
22. G. Baggerman, A. Cerstiaens, A. De Loof, L. Schoofs,
J. Biol. Chem. 277, 40368 (2002).
23. We dedicate this article to A. De Loof for a full career of
insect research at the K. U. Leuven. We thank T. Newman
for assistance with qRT-PCR, B. Southey for aiding our
BLAST searches, and C. Whitfield for helpful discussions.
We acknowledge M. Corona for annotating the insulin
gene. Supported by the National Institutes of Health
through NS31609, P30 DA01830 (J.V.S.), GM068946
(S.L.R.-Z.), and DC006395 (G.E.R.). P.V., G.B., J.H., and
E.V. are postdoctoral researchers of the FWO-Flanders
(Fund for Scientific Research–Flanders), supported by
Supporting Online Material
Materials and Methods
Tables S1 to S4
21 December 2005; accepted 27 July 2006
Bacterial Taxa That Limit Sulfur
Flux from the Ocean
Erinn C. Howard,1 James R. Henriksen,1 Alison Buchan,3 Chris R. Reisch,1 Helmut Bürgmann,2
Rory Welsh,2 Wenying Ye,2 José M. González,4 Kimberly Mace,2 Samantha B. Joye,2
Ronald P. Kiene,5,6 William B. Whitman,1 Mary Ann Moran2*
Flux of dimethylsulfide (DMS) from ocean surface waters is the predominant natural source of sulfur to
the atmosphere and influences climate by aerosol formation. Marine bacterioplankton regulate sulfur
flux by converting the precursor dimethylsulfoniopropionate (DMSP) either to DMS or to sulfur
compounds that are not climatically active. Through the discovery of a glycine cleavage T-family protein
with DMSP methyltransferase activity, marine bacterioplankton in the Roseobacter and SAR11 taxa were
identified as primary mediators of DMSP demethylation to methylmercaptopropionate. One-third of surface
ocean bacteria harbor a DMSP demethylase homolog and thereby route a substantial fraction of global
marine primary production away from DMS formation and into the marine microbial food web.
[...] for use as an osmolyte (1), predator
deterrent (2), and antioxidant (3). The
degradation of DMSP to DMS and subsequent
exchange of DMS across the ocean-atmosphere
boundary is the main natural source of sulfur to
the atmosphere, amounting to ~20 Tg of sulfur
annually (4). DMS-derived atmospheric sulfur
affects cloud formation and the radiative properties
of Earth (5). Phytoplankton are known to degrade
DMSP to DMS, but efforts to predict global
patterns of ocean-atmosphere DMS flux based
solely on phytoplankton parameters have been
unsuccessful (6). Other members of the marine
plankton must therefore influence the production
and emission of DMS from the surface ocean (7).
1Department of Microbiology, University of Georgia, Athens, GA 30602, USA.
2Department of Marine Sciences, University of Georgia, Athens, GA 30602, USA.
3Department of Microbiology, University of Tennessee, Knoxville, TN 37996, USA.
4Department of Microbiology, University of La Laguna, 38071 La Laguna, Tenerife, Spain.
5Department of Marine Sciences, University of South Alabama, Mobile, AL 36688, USA.
6Dauphin Island Sea Lab, Dauphin Island, AL 36528,
*To whom correspondence should be addressed. E-mail: