Science 311, 1768 (2006); Maureen L. Coleman et al.,
Genomic Islands and the Ecology and Evolution of Prochlorococcus
Science (print ISSN 0036-8075; online ISSN 1095-9203) is published weekly, except the last week in December, by the
American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005.
Copyright © 2006 by the American Association for the Advancement of Science; all rights reserved. The title SCIENCE is a
registered trademark of AAAS.
Genomic Islands and the Ecology and
Evolution of Prochlorococcus
Maureen L. Coleman,1 Matthew B. Sullivan,1 Adam C. Martiny,1 Claudia Steglich,1*
Kerrie Barry,2 Edward F. DeLong,1 Sallie W. Chisholm1†
Prochlorococcus ecotypes are a useful system for exploring the origin and function of diversity
among closely related microbes. The genetic variability between phenotypically distinct strains
that differ by less than 1% in 16S ribosomal RNA sequences occurs mostly in genomic islands.
Island genes appear to have been acquired in part by phage-mediated lateral gene transfer, and
some are differentially expressed under light and nutrient stress. Furthermore, genome fragments
directly recovered from ocean ecosystems indicate that these islands are variable among co-
occurring Prochlorococcus cells. Genomic islands in this free-living photoautotroph share features
with pathogenicity islands of parasitic bacteria, suggesting a general mechanism for niche
differentiation in microbial species.
Closely related bacterial isolates often contain remarkable genomic diversity (1, 2).
Although its functional consequences have been described in a few model heterotrophic
microbes (3), little is known about genomic microdiversity in the microbial phototrophs
that dominate aquatic ecosystems. The marine cyanobacterium Prochlorococcus offers a
useful system for studying this issue, because it is globally abundant, has very simple
growth requirements, has a very compact genome [1.7 to 2.4 megabases (Mb)], and lives in
a well-mixed habitat. Although the latter appears to offer few opportunities for niche
differentiation, Prochlorococcus populations consist of multiple coexisting ecotypes (4),
whose relative abundances vary markedly along gradients of light, temperature, and
nutrients (5–9). Even two high-light adapted (HL) ecotypes, whose type strains (MED4
and MIT9312) differ by only 0.8% in 16S ribosomal RNA (rRNA) sequence, have
substantially different distributions in the wild (5–9).
Although whole-genome comparisons between the most distantly related Prochlorococcus
isolates (97.9% 16S rRNA identity) have revealed the gross signatures of this niche
differentiation (10), important insights into the evolution of diversity in this group
likely lie in comparisons of more closely related strains. To that end, we compared the
complete genomes of the type strains, MED4 and MIT9312, that represent the two HL clades,
and we analyzed genome fragments from wild cells belonging to these clades from the
Atlantic and Pacific oceans.
The 1574 shared genes of MED4 and MIT9312 have conserved order and orientation, except
for a large inversion around the replication terminus (Fig. 1). The average G + C
content is similar in both genomes (31%), and the median sequence identity of the shared
genes is 78%, surprisingly low for strains so similar at the rRNA locus (11). For most
genes, synonymous sites are saturated and protein sequence identity is low (median 80%);
this is likely a function of high mutation rates, given that HL Prochlorococcus lack
several important DNA-repair enzymes (10, 12).
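The median identity figures quoted above summarize many pairwise comparisons of orthologous genes. The following is a minimal sketch of how such a summary could be computed, assuming each ortholog pair has already been aligned to equal length with "-" for gaps. The function name, the choice to skip gapped columns, and the toy sequences are illustrative assumptions, not the authors' pipeline or data.

```python
from statistics import median


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignments, not sequences from MED4 or MIT9312.
toy_ortholog_pairs = [
    ("ATGGCTAAAG", "ATGGCAAAAG"),
    ("ATGAAA-TTT", "ATGAAGCTTT"),
    ("ATGCCCGGGT", "ATGCTCGAGA"),
]

identities = [percent_identity(a, b) for a, b in toy_ortholog_pairs]
median_identity = median(identities)
```

A median is less sensitive than a mean to a handful of very divergent or very conserved genes, which may be why it suits a genome-wide summary like this one.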
The strain-specific genes between MED4
and MIT9312 (236 in MIT9312 and 139 in
MED4) occur primarily (80 and 74%, respec-
tively) in five major islands (Fig. 1). Thus, these
genomes have a mosaic structure similar to that
of Escherichia coli genomes (1), though on a
smaller scale. The islands are located in the
same position in both genomes, implying that
they are hotspots for recombination, and the
length of island genes is similar to the whole-
genome average, suggesting that they are not
degraded. We hypothesize that these islands
arose via lateral gene transfer and continually un-
dergo rearrangement, on the basis of a number of
characteristics. First, three islands are associated
with tRNA genes (fig. S1), which are common
integration sites for mobile elements (13). Second,
the 3′ end of tRNA-proline, which flanks
ISL3 in both genomes, is repeated 13 times in
MIT9312-ISL3 (Fig. 2A) and three times in
MED4-ISL3 (fig. S2), suggesting repeated
remodeling of this island. Third, some of the
genes found in a particular island in MED4 are
found in a different island in MIT9312 (Fig. 1), a
rearrangement that may have been mediated by a
48–base pair sequence element we call PRE1
(Prochlorococcus repeat element 1; fig. S3);
portions of PRE1 are repeated, almost exclu-
sively in islands, 13 times in MED4 (fig. S2), and
9 times in MIT9312 (Fig. 2A). Finally, up to 80%
of the genes in any given MIT9312 island are
most similar to the genes of noncyanobacterial
organisms including phage, Eukarya, and Archaea,
consistent with the recent observation that hor-
izontally acquired genomic islands reflect a
gene pool that differs from that of the core genome (14).
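Counting copies of a short element such as a tRNA 3′ end or PRE1 amounts to scanning both strands of a genome for a motif. The sketch below does this for exact matches only, which is a simplifying assumption: the text describes portions of PRE1 being repeated, so a real search would likely tolerate partial or inexact matches. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing A, C, G, T, or N."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(genome: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact hit, overlapping hits included."""
    genome = genome.upper()
    forward = motif.upper()
    reverse = reverse_complement(forward)
    targets = [("+", forward)]
    if reverse != forward:  # avoid double-counting palindromic motifs
        targets.append(("-", reverse))
    hits = []
    for strand, pattern in targets:
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical toy sequence and motif, not taken from either genome.
toy_genome = "AAGGCTTTTAGCCTTGGCT"
toy_hits = find_motif(toy_genome, "GGCT")
```

Reporting the strand alongside each position mirrors the Fig. 2A convention of drawing elements above or below the line for the forward or reverse strand.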
It is likely that phage, which often carry
host genes (15, 16), mediate some of the island-
associated lateral gene transfer, and the hli gene
family in particular appears to have undergone
repeated phage-host gene exchange (16). Of the
24 hli genes in MIT9312, 18 are found in the
five major islands or their flanking regions. All
18 belong to the multicopy and sporadically
distributed group that includes phage copies
(Fig. 2A) and is well differentiated from wide-
spread single-copy hli genes found in cyano-
bacteria (16). Other phagelike genes in islands
include an integrase, DNA methylases, a second
phoH, a MarR-family transcriptional regulator, a
putative hemagglutinin neuraminidase, and an
endonuclease (15), further supporting a link be-
tween phage and island dynamics.
Many island genes in the two strains ap-
pear to encode functions related to physiolog-
ical stress and nutrient uptake and thus may be
important in the high-light, low-nutrient surface
waters dominated by HL Prochlorococcus.
ISL2 and ISL5 in MIT9312, for example, en-
code 12 of the 24 hli genes, known to be im-
portant under a variety of stress conditions (17);
they also encode two outer-membrane transport
1Massachusetts Institute of Technology, Department of Civil
and Environmental Engineering, 15 Vassar Street, Cambridge,
MA 02139, USA. 2U.S. Department of Energy Joint
Genome Institute, Production Genomics Facility, Walnut
Creek, CA 94598, USA.
*Present address: University Freiburg, Department of Biology
II/Experimental Bioinformatics, Schänzlestrasse 1, D-79104
†To whom correspondence should be addressed. E-mail:
Fig. 1. Whole-genome alignment showing the positions of orthologous
genes in MED4 and MIT9312. Strain-specific genes appear on the
axes. The locations of five major islands (ISL1 to ISL5) defined by
whole-genome alignment (25) are shaded.
24 MARCH 2006 VOL 311 SCIENCE www.sciencemag.org
proteins; and a cyanophage-like homolog of
phoH thought to be involved in the phosphate
stress response (15). ISL3 in this strain contains
a paralog of psbF, which encodes part of
cytochrome b559, thought to protect against
photoinhibition (18). Islands also contain genes
involved in nutrient assimilation, including a
cyanate transporter and lyase in MED4 and two
transporters, for manganese/iron and amino
acids, in MIT9312 (fig. S1).
In addition to genes involved in potentially
growth-limiting processes, islands also contain
genes that could play a role in selective mortal-
ity. ISL4 in both MED4 and MIT9312 encodes
proteins involved in cell surface modification,
including biosynthesis of lipopolysaccharide, a
common phage receptor (19) (fig. S1). Phages
are important agents of mortality in the oceans
(20), and thus cell surface properties are likely
under strong selection.
Clearly, for island genes to influence a cell's
fitness, they must be expressed. When MED4
cells are starved for phosphorus, nine ISL5
genes are differentially expressed, nearly all of
unknown function (table S1). When cells are
shifted to high light, 38 island genes are differ-
entially expressed, including seven hli genes
(table S1) that in Synechocystis encode proteins
that accumulate when cells absorb excess ex-
citation energy (e.g., under high light, nutri-
ent limitation, and low temperatures) (17).
Thus, 26% of all MED4 island genes are dif-
ferentially expressed under P starvation or
high-light stress; only one of these is differ-
entially expressed under both conditions (con-
served hypothetical gene PMM1416), suggesting
that island genes contribute to specific stress
responses.
The genome variation within the eMIT9312
clade [sensu (7)] was examined in wild populations
of Prochlorococcus by aligning short
genome fragments from the Sargasso Sea (21),
where this clade dominates (7), against the
MIT9312 genome (Fig. 2B). Nearly constant
coverage was observed, confirming a stable
core genome, except for notable gaps at ISL1,
ISL3, and ISL4. This finding indicates that
very few wild sequences match genes in these
islands, and it supports the hypothesis that
these regions are hypervariable in HL Pro-
chlorococcus genomes. In contrast, genes be-
longing to ISL2 and ISL5 are relatively well
represented in the Sargasso Sea data set (Fig.
2B, fig. S2). In MED4 and MIT9312, these
islands contain about half of the hli genes,
lack the tRNA genes implicated in integration
of mobile elements, and contain a smaller frac-
tion of noncyanobacterial genes than do the
other islands. This finding suggests that the
genes in these islands have become fixed in
this wild population.
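The coverage analysis described above can be pictured as counting, for each position in the reference genome, how many environmental fragments align there. Below is a minimal sketch under stated assumptions: alignments are given as 0-based, end-exclusive (start, end) intervals on the reference, and coverage is averaged in fixed non-overlapping windows. The value of -2 for zero coverage follows the convention stated in the Fig. 2 caption; the window scheme, function names, and toy intervals are hypothetical.

```python
import math


def coverage_per_position(genome_length, alignments):
    """Count how many aligned fragments cover each reference position."""
    diff = [0] * (genome_length + 1)
    for start, end in alignments:
        start = max(0, start)
        end = min(genome_length, end)
        if start < end:
            diff[start] += 1  # coverage rises where a fragment begins
            diff[end] -= 1    # and falls just past where it ends
    coverage = []
    running = 0
    for delta in diff[:genome_length]:
        running += delta
        coverage.append(running)
    return coverage


def windowed_log10_coverage(coverage, window, floor=-2.0):
    """Mean coverage per window on a log10 scale; empty windows get `floor`."""
    values = []
    for i in range(0, len(coverage), window):
        chunk = coverage[i:i + window]
        mean = sum(chunk) / len(chunk)
        values.append(math.log10(mean) if mean > 0 else floor)
    return values


# Hypothetical toy reference of 20 positions with four aligned fragments.
toy_coverage = coverage_per_position(20, [(0, 10), (0, 5), (15, 20), (5, 10)])
toy_log_coverage = windowed_log10_coverage(toy_coverage, window=5)
```

In this toy case the third window has no aligned fragments, so it receives the floor value, which is how a gap such as those at ISL1, ISL3, and ISL4 would show up in a plot of this kind.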
Examination of 36 large genome fragments
(1.1 Mb total sequence; median size 34 kb)
(table S2) from the Hawaii Ocean Time-Series
Station (22) further confirms that a stable core
genome exists; most fragments showed remarkable conservation
relative to the MED4 and MIT9312 genomes. Thirty-four of
the 36 fragments were more similar to MIT9312
than to MED4; two contained rRNA operons,
confirming their phylogenetic affiliation with
the eMIT9312 clade (fig. S4). The eMIT9312
fragments have about 90% identity with the
MIT9312 genome and about 80% with MED4
(Table 1). Collectively, these results suggest that
the wild eMIT9312 population is a coherent
group identifiable by sequence similarity in the
absence of an rRNA operon (11). eMIT9312
genome fragments from this wild population are
more similar to each other than to the genome
of the type strain MIT9312 (isolated from the
Atlantic Ocean), but still share only 93% aver-
age sequence identity (Table 1), indicating high
coexisting diversity in core genes.
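The idea that a fragment can be placed in a clade by sequence similarity alone, without an rRNA operon, can be illustrated with a simple rule: assign each fragment to the reference genome it matches best. This is a minimal sketch, not the authors' method. The margin parameter, the "ambiguous" label, the fragment names, and the identity values are all hypothetical.

```python
def closest_reference(identities, min_margin=2.0):
    """Name of the best-matching reference, or 'ambiguous' if the lead is small.

    `identities` maps at least two reference names to a percent identity.
    `min_margin` (in percentage points) is an illustrative assumption.
    """
    ranked = sorted(identities.items(), key=lambda item: item[1], reverse=True)
    best_name, best_value = ranked[0]
    second_value = ranked[1][1]
    if best_value - second_value < min_margin:
        return "ambiguous"
    return best_name


# Hypothetical fragments with made-up identities to the two type strains.
toy_fragments = {
    "frag_A": {"MIT9312": 91.0, "MED4": 80.0},
    "frag_B": {"MIT9312": 80.1, "MED4": 88.0},
    "frag_C": {"MIT9312": 85.0, "MED4": 84.5},
}
toy_assignments = {
    name: closest_reference(values) for name, values in toy_fragments.items()
}
```

With a gap of roughly ten percentage points between the two references, as reported for the eMIT9312 fragments, a rule like this would rarely return an ambiguous call.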
Five eMIT9312 genome fragments from the
Hawaii sample border the major islands defined
above. About 60% of the genes in these islands
have no ortholog in either MED4 or MIT9312,
and two fragments border ISL1, yet their gene
content is largely different from each other and
from the MIT9312 and MED4 genomes (fig.
S5). Indeed, a third of the island genes in these
two fragments are novel, i.e., have no detect-
able homologs, implying that cells have access
to a large novel gene pool in the oceans (14).
Like the islands in the MED4 and MIT9312
genomes, these two fragments contain signa-
tures of mobility, including duplicated tRNA
genes, copies of the repeat PRE1, and an inte-
grase gene. This reveals that islands are dy-
namic even within a single ecotype clade as we
have defined it.
One observation that stimulated this work is
the dramatic difference in distribution and abun-
dance of the two HL Prochlorococcus ecotype
clusters (5–9), as defined by their rRNA internal
transcribed spacer (ITS) sequence similarity.
Although strains belonging to these two clusters
have different island gene content, so do cells
from field populations that belong to a single
cluster. Therefore, other genomic features are
likely to be important in explaining niche dif-
ferentiation between eMED4 and eMIT9312
cells in the wild. Differential temperature
adaptation, for example, which is thought to be
an important determinant of ecotype distribu-
[Fig. 2 axis labels: position along MIT9312 genome; percent identity; coverage]
Fig. 2. Features of genomic islands (shaded) in the Prochlorococcus strain MIT9312 genome compared
with wild sequences from the Atlantic and Pacific Oceans. (A) Locations of repetitive elements and hli
genes in MIT9312, shown above or below the horizontal line for the forward or reverse strand,
respectively. hli genes shown in pink belong to the single-copy conserved group and those shown in
blue belong to the multicopy phage-encoded group (16). (B) Percent identity of Sargasso Sea shotgun
database sequences (21) aligned to MIT9312 (top, left axis) and average coverage in the database of a
given position in the MIT9312 genome (bottom, right axis). Log10(coverage) is set to –2 when coverage
equals 0. (C) Genomic locations and percent identity of wild genome fragments (eMIT9312-like unless
noted) aligned to MIT9312. Where the alignment is interrupted, a black line connects aligned segments
of a single fragment. Fragments are projected down to 70% horizontal to visualize total coverage.
Table 1. Median pairwise percent identities, for all orthologous gene pairs and for large aligned
regions >4 kb (25). Numbers in parentheses indicate the number of orthologous gene pairs from
which the median was calculated.
Large aligned regions
78.4 (1574)  79.5 (1063)  90.6 (1092)  93.2 (434)
80.0 (1574)  82.4 (1063)  92.9 (1092)  95.2 (434)
tions (5), can be achieved through sequence (23)
or regulatory (24) changes in the core genome.
Nonetheless, given their prevalence, mobility,
and expression under relevant conditions, is-
lands likely play a role in adaptation, but on
shorter time scales, or more local spatial scales,
in the context of large populations that harbor
substantial genomic variability.
Thus, although streamlined for life in the
oligotrophic oceans, the genomes of HL
Prochlorococcus are not static. Cell-to-cell
genome variability is concentrated in islands
containing genes that are differentially expressed
under stresses typical of oceanic environments.
Just as pathogenicity islands alter the host
specificity and virulence of pathogenic bacteria
(3), genomic islands in Prochlorococcus may
contribute to niche differentiation in the surface
oceans. Although other factors, such as small
insertions and deletions, substitutions in homol-
ogous proteins, and differential regulation are
important contributors to diversity, the preva-
lence of genomic islands and their features argue
that these also play an influential role. We
postulate that lateral gene transfer in genomic
islands is an important mechanism for local
specialization in the oceans. If true, genomic
islands of natural taxa should contain genes that
are ecologically important in a given environ-
ment, regardless of the core genome phylogeny.
Testing this hypothesis will not only advance our
but also contribute to a unified understanding of
genomic evolutionary mechanisms and their
impact on microbial ecology.
References and Notes
1. R. A. Welch et al., Proc. Natl. Acad. Sci. U.S.A. 99, 17020
2. J. R. Thompson et al., Science 307, 1311 (2005).
3. J. Hacker, J. B. Kaper, Annu. Rev. Microbiol. 54, 641 (2000).
4. L. R. Moore, G. Rocap, S. W. Chisholm, Nature 393, 464
5. Z. I. Johnson et al., Science 311, 1737 (2006).
6. E. R. Zinser et al., Appl. Environ. Microbiol. 72, 723 (2006).
7. N. Ahlgren, G. Rocap, S. W. Chisholm, Environ. Microbiol.
8, 441 (2006).
8. N. J. West, D. J. Scanlan, Appl. Environ. Microbiol. 65,
9. N. J. West et al., Microbiology 147, 1731 (2001).
10. G. Rocap et al., Nature 424, 1042 (2003).
11. K. T. Konstantinidis, J. M. Tiedje, Proc. Natl. Acad. Sci.
U.S.A. 102, 2567 (2005).
12. A. Dufresne, L. Garczarek, F. Partensky, Genome Biol. 6,
13. W. D. Reiter, P. Palm, S. Yeats, Nucleic Acids Res. 17,
14. W. W. Hsiao et al., PLoS Genet. 1, e62 (2005).
15. M. B. Sullivan, M. L. Coleman, P. Weigele, F. Rohwer,
S. W. Chisholm, PLoS Biol. 3, e144 (2005).
16. D. Lindell et al., Proc. Natl. Acad. Sci. U.S.A. 101, 11013
17. Q. He, N. Dolganov, O. Bjorkman, A. R. Grossman, J. Biol.
Chem. 276, 306 (2001).
18. D. H. Stewart, G. W. Brudvig, Biochim. Biophys. Acta
1367, 63 (1998).
19. A. Wright, M. McConnell, S. Kanegasaki, in Virus
Receptors, L. L. Randall, L. Philipson, Eds. (Chapman and
Hall, New York, 1980), pp. 27–57.
20. J. A. Fuhrman, Nature 399, 541 (1999).
21. J. C. Venter et al., Science 304, 66 (2004).
22. E. F. DeLong et al., Science 311, 496 (2006).
23. G. N. Somero, Annu. Rev. Physiol. 57, 43 (1995).
24. M. M. Riehle, A. F. Bennett, R. E. Lenski, A. D. Long,
Physiol. Genomics 14, 47 (2003).
25. Materials and methods are available as supporting
material on Science Online.
26. We thank T. Rector, N. Hausman, and R. Steen for
Affymetrix microarray processing; M. Polz for helpful
discussions; and D. Lindell and A. Tolonen for comments
on the manuscript. This work was supported by grants
from NSF Biological Oceanography (S.W.C.) and Micro-
bial Observatory (E.F.D.) Programs, the U.S. Department
of Energy (DOE) GTL Program (to S.W.C. and G. Church),
and the Gordon and Betty Moore Foundation (S.W.C.
and E.F.D.). Sequencing support came from the DOE
Microbial Genomics Program (E.F.D.) and DOE GTL and
Community Sequencing Program (S.W.C.), conducted at
the DOE Joint Genome Institute. Sequences are available
in GenBank: BX548174 (MED4 genome), CP000111
(MIT9312 genome), and DQ366711 to DQ366746
(environmental genome fragments).
Supporting Online Material
Materials and Methods
Figs. S1 to S5
Tables S1 to S3
31 October 2005; accepted 17 February 2006
Toll-Like Receptor Triggering of
a Vitamin D–Mediated Human
Philip T. Liu,1,2* Steffen Stenger,4* Huiying Li,3 Linda Wenzel,4 Belinda H. Tan,1,2
Stephan R. Krutzik,2 Maria Teresa Ochoa,2 Jürgen Schauber,5 Kent Wu,1 Christoph Meinken,4
Diane L. Kamen,6 Manfred Wagner,7 Robert Bals,8 Andreas Steinmeyer,9 Ulrich Zügel,10
Richard L. Gallo,5 David Eisenberg,3 Martin Hewison,11 Bruce W. Hollis,12 John S. Adams,11
Barry R. Bloom,13 Robert L. Modlin1,2†
In innate immune responses, activation of Toll-like receptors (TLRs) triggers direct antimicrobial
activity against intracellular bacteria, which in murine, but not human, monocytes and
macrophages is mediated principally by nitric oxide. We report here that TLR activation of human
macrophages up-regulated expression of the vitamin D receptor and the vitamin D-1–hydroxylase
genes, leading to induction of the antimicrobial peptide cathelicidin and killing of intracellular
Mycobacterium tuberculosis. We also observed that sera from African-American individuals, known
to have increased susceptibility to tuberculosis, had low 25-hydroxyvitamin D and were inefficient
in supporting cathelicidin messenger RNA induction. These data support a link between TLRs and
vitamin D–mediated innate immunity and suggest that differences in ability of human populations
to produce vitamin D may contribute to susceptibility to microbial infection.
The innate immune system provides a rapid
host mechanism for defense against microbial
pathogens. In Drosophila, innate immunity
is mediated in part by the Toll family of
pattern-recognition receptors, whose activation
induces expression of a series of antimicrobial
peptides (1). The mammalian TLR homologs,
including the TLR2 and TLR1 heterodimer (2),
similarly recognize a variety of microbial-derived
ligands, including bacterial lipopeptides. Activa-
tion of TLRs results in a direct antimicrobial
response in monocytes and macrophages in vitro.
In mice, this activity is mediated principally
through generation of nitric oxide (3, 4). How-
ever, we found that TLR2/1-induced antimicrobi-
al activity in human macrophages is not affected
by inhibitors of nitric oxide or reactive oxygen
intermediates (5), and the mechanism of human
microbicidal activity remains unresolved.
In studies of resistance to M. tuberculosis,
we observed that activation of TLR2/1 reduced
the viability of intracellular M. tuberculosis in
human monocytes and macrophages but not in
monocyte-derived dendritic cells (DCs) [Fig.
1A and (5, 6)]. Consequently, we used DNA
microarrays to examine gene expression pro-
files of monocytes and DCs stimulated with a
synthetic 19-kD M. tuberculosis–derived lipo-
peptide (TLR2/1L) or treated with medium (6).
A two-way ANOVA was applied to the array
data to identify genes differentially expressed in
the two cell types after TLR2/1L treatment (6).
Genes up-regulated in monocytes, but not in
DCs, with significant P values [P < 0.05; the
false discovery rate (FDR), which is the expected
proportion of false rejections among all rejections,
was 0.09] were cross-referenced against a list of
genes associated with known antimicrobial func-
tion, yielding two candidates: vitamin D receptor
(VDR) and S100A12, a calcium-binding pro-
inflammatory molecule (7) (Fig. 1B). Although
TLR2/1 stimulation of DCs up-regulated spe-
cific genes characteristic of activation (Fig. 1B),
the selective up-regulation of the VDR gene in
monocytes prompted us to examine further se-
lected VDR-related genes. From these analyses,