The Complete Genome of Teredinibacter turnerae T7901:
An Intracellular Endosymbiont of Marine Wood-Boring
Joyce C. Yang1, Ramana Madupu2, A. Scott Durkin2, Nathan A. Ekborg1¤, Chandra S. Pedamallu4,
Jessica B. Hostetler2, Diana Radune2, Bradley S. Toms2, Bernard Henrissat5, Pedro M. Coutinho5, Sandra
Schwarz6, Lauren Field4, Amaro E. Trindade-Silva7, Carlos A. G. Soares7, Sherif Elshahawi8, Amro
Hanora9, Eric W. Schmidt10, Margo G. Haygood8, Janos Posfai4, Jack Benner4, Catherine Madinger4, John
Nove1, Brian Anton4, Kshitiz Chaudhary4, Jeremy Foster4, Alex Holman4, Sanjay Kumar4, Philip A
Lessard3¤, Yvette A. Luyten1,4, Barton Slatko4, Nicole Wood1, Bo Wu4, Max Teplitski11, Joseph D.
Mougous6, Naomi Ward12, Jonathan A. Eisen13, Jonathan H. Badger2, Daniel L. Distel1*
1Ocean Genome Legacy, Inc., Ipswich, Massachusetts, United States of America, 2J. Craig Venter Institute, San Diego, California, United States of America,
3Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America, 4New England Biolabs, Ipswich, Massachusetts, United States of America,
5Architecture et Fonction des Macromole ´cules Biologiques, UMR6098, CNRS, Universite ´s Aix-Marseille I & II, Case 932, Marseille, France, 6Department of Microbiology,
University of Washington, Seattle, Washington, United States of America, 7Universidade Federal do Rio de Janeiro, Instituto de Biologia, Ilha do Fundao, CCS, Rio de
Janeiro, Rio de Janeiro, Brazil, 8Department of Environmental and Biomolecular Systems, OGI School of Science & Engineering, Oregon Health & Science University,
Beaverton, Oregon, United States of America, 9Department of Microbiology and Immunology, Faculty of Pharmacy, Suez Canal University, Ismailia, Egypt, 10College of
Pharmacy, University of Utah, Salt Lake City, Utah, United States of America, 11University of Florida, Gainesville, Florida, United States of America, 12Department of
Molecular Biology, University of Wyoming, Laramie, Wyoming, United States of America, 13UC Davis Genome Center, University of California Davis, Davis, California,
United States of America
Here we report the complete genome sequence of Teredinibacter turnerae T7901. T. turnerae is a marine gamma
proteobacterium that occurs as an intracellular endosymbiont in the gills of wood-boring marine bivalves of the family
Teredinidae (shipworms). This species is the sole cultivated member of an endosymbiotic consortium thought to provide
the host with enzymes, including cellulases and nitrogenase, critical for digestion of wood and supplementation of the
host’s nitrogen-deficient diet. T. turnerae is closely related to the free-living marine polysaccharide degrading bacterium
Saccharophagus degradans str. 2–40 and to as yet uncultivated endosymbionts with which it coexists in shipworm cells. Like
S. degradans, the T. turnerae genome encodes a large number of enzymes predicted to be involved in complex
polysaccharide degradation (.100). However, unlike S. degradans, which degrades a broad spectrum (.10 classes) of
complex plant, fungal and algal polysaccharides, T. turnerae primarily encodes enzymes associated with deconstruction of
terrestrial woody plant material. Also unlike S. degradans and many other eubacteria, T. turnerae dedicates a large
proportion of its genome to genes predicted to function in secondary metabolism. Despite its intracellular niche, the T.
turnerae genome lacks many features associated with obligate intracellular existence (e.g. reduced genome size, reduced
%G+C, loss of genes of core metabolism) and displays evidence of adaptations common to free-living bacteria (e.g. defense
against bacteriophage infection). These results suggest that T. turnerae is likely a facultative intracellular ensosymbiont
whose niche presently includes, or recently included, free-living existence. As such, the T. turnerae genome provides insights
into the range of genomic adaptations associated with intracellular endosymbiosis as well as enzymatic mechanisms
relevant to the recycling of plant materials in marine environments and the production of cellulose-derived biofuels.
Citation: Yang JC, Madupu R, Durkin AS, Ekborg NA, Pedamallu CS, et al. (2009) The Complete Genome of Teredinibacter turnerae T7901: An Intracellular
Endosymbiont of Marine Wood-Boring Bivalves (Shipworms). PLoS ONE 4(7): e6085. doi:10.1371/journal.pone.0006085
Editor: Niyaz Ahmed, University of Hyderabad, India
Received January 27, 2009; Accepted May 6, 2009; Published July 1, 2009
Copyright: ? 2009 Yang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: We are grateful for the financial support of New England Biolabs, the National Science Foundation (grant no. 0523862 to D.L.D.) and Oregon
Opportunity Funds from the State of Oregon, NIH (to M.G.H.). Contributions by A. E. Trindade-Silva and C. A. G. Soares were supported in part by the Brazilian
agency CNPq (Grant MCT/CNPq 02/2006 No.: 470967/2006-4) and by the Brazilian fellowship programs of CNPq and PDEE/CAPES. The funders had no role in
study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: email@example.com
¤ Current address: Agrivida, Inc., Medford, Massachusetts, United States of America
PLoS ONE | www.plosone.org1July 2009 | Volume 4 | Issue 7 | e6085
Teredinibacter turnerae is a Gram-negative gamma proteobacter-
ium that has been isolated from the gills of a broad range of wood-
boring marine bivalves of the family Teredinidae (shipworms)
[1,2]. This species has been shown to coexist with other as yet
uncultivated bacteria as a component of an intracellular
endosymbiotic bacterial consortium within specialized cells
(bacteriocytes) of the gill epithelium [3,4]. It displays an unusual
combination of properties, being the only aerobic bacterium
known to grow with cellulose and dinitrogen, respectively, as its
sole carbon and nitrogen sources .
The cellulolytic and diazotrophic capabilities of T. turnerae
suggested two potential roles for this bacterium in the shipworm
symbiosis . The first is to produce enzymes that may assist the
host in degrading carbohydrate components of woody plant
materials (cellulose, hemicellulose, and pectin). Shipworms are the
only marine animals known to grow and reproduce normally with
wood as their sole source of particulate food . The second is to
provide a source of fixed nitrogen to supplement the host’s
nitrogen deficient diet of wood. The latter function of shipworm
symbionts was recently demonstrated experimentally .
The capacity to degrade woody plant materials is of interest
because these materials are extraordinarily abundant in nature
and serve as major reservoirs of carbon and energy . Plant cell
walls are typically composed of a complex composite of cellulose, a
linear homopolymer of beta 1–4 linked glucose residues,
hemicellulose, a decorated heteropolymer of xylose units, and
lignins, heterogeneous polymers of aromatic residues. Pectin, a
heteropolymer containing alpha 1–4 linked galacturonic acid, is
also an important component of plant cell walls. The low solubility
of these compounds, and their tendency to form crystalline arrays
with complex interconnecting networks of ether and ester linkages,
make woody plant materials highly recalcitrant to enzymatic
The complete degradation of woody plant materials requires
numerous enzymes, which in nature are often contributed by
multiple microorganisms acting in concert. Cellulose degradation
requires at least three types of hydrolytic activities: beta-1,4-
glucosidase [E.C. 22.214.171.124], cellobiohydrolase [E.C. 126.96.36.199] and
endoglucanase [E.C. 188.8.131.52] (Figure 1A). The depolymerization
of hemicellulose requires carbohydrases and esterases that serve to
break the xylan backbone and decouple side-chains that may bind
to the lignin components of wood (Figure 1B).
The majority of enzymes known to degrade complex polysac-
charides belong to a diverse functional category called glycoside
hydrolases (GH). GHs are assigned to 112 families (http://afmb.
cnrs-mrs.fr/CAZY/) based on nucleotide or amino acid sequence.
Many of these families are functionally heterogeneous, containing
members that differ in substrate specificities as well as sites and
modes of substrate cleavage. In addition to GH activities,
degradation of complex polysaccharides may also involve activity
of carbohydrate esterases (CE), polysaccharide lyases (PL), and
carbohydrate binding modules (CBMs). Moreover, these activities
are often found within modular proteins that may contain multiple
domains with differing catalytic and substrate binding properties,
separated by non-catalytic linker domains. Thus, the composition
and organization of polysaccharide degrading proteins may be
highly diverse and variable and so exploration of new systems is of
The genome of T. turnerae is also of interest as an example of the
range of adaptations associated with intracellular endosymbionts
of eukaryotes. A characteristic suite of genomic modifications,
including reduced genome size, skewed %G+C, elevated mutation
rates and loss of genes of core metabolism, has been identified
through analysis of genomes of a number of obligate intracellular
symbionts . However, this is not the case for T. turnerae, which
stands as an example of a bacterium that has been observed in
nature only as an endosymbiont, but that can be cultivated in vitro
in a simple defined medium without added vitamins or growth
factors. Its comparison to known obligate endosymbionts may
therefore be informative.
Here we report the complete genome sequence of the T. turnerae
strain T7901 (ATCC 39867) isolated from the shipworm Bankia
gouldi. We examine and discuss the composition and architecture
of the genome of this strain with emphasis on systems pertinent to
symbiosis and free-living existence.
Results and Discussions
Genome features and comparative genomics
General genome features.
turnerae T7901 is a single circular molecule of length 5,192,641 bp
(50.8% G+C) (Table 1, Supporting Information: Figure S1). The
genome encodes 4,690 predicted protein-coding regions. Of these,
3,067 (65.4%) could be assigned a function based on inferred
homology, 1026 (21.9%) are hypothetical proteins (no inferred
homology to any previously identified proteins), 589 (12.2%) are
hypothetical proteins encoded by other genomes), and the
remaining 26 (0.5%) appear to be homologues of experimentally
confirmed proteins of unknown function. The average ORF length
is 973 bp and the average intergenic region is 130 bp. No
extrachromosomal elements were detected.
Similar strains of T. turnerae have
been isolated from the gills of 23 species of teredinid bivalves
representing 9 host genera collected along the coasts of North and
South America, India, Australia and Hawaii [1,2]. All strains have
similar properties and five strains that have been examined by
small subunit (16S) ribosomal rRNA sequence analysis are nearly
identical with respect to this locus (.99.7% identity: Figure 2).
Phylogenetic analyses of 16S rRNA sequences help to identify
closely related bacteria for genomic comparison to T. turnerae
T7901. These indicate that T. turnerae is most closely related to
several as yet uncultivated bacterial endosymbionts that have been
identified in shipworm gill tissues using cultivation-independent
methods (Figure 2) [3,4]. The closest known free-living (and
presumably non-symbiotic) relative is Saccharophagus degradans str.
2–40 , a marine bacterium isolated from decaying sea grass
(Spartina alterniflora) in the Chesapeake Bay watershed. This
bacterium degrades an unusually broad spectrum of plant, algal,
and fungal cell wall components, including .10 classes of complex
polysaccharides. Also included within this clade is ‘‘Candidatus
Endobugula sertula’’ , the as yet uncultivated symbiont of the
bryozoan, Bugula neritina. This symbiont is known to contribute to
the chemical defenses of this host species during larval stages by
[12,13,14,15] that inhibits predation by fish and that is being
considered as a candidate drug for treatment of cancer and
dementia. This evokes an additional potential role for T. turnerae as
a source of secondary metabolites that may contribute to host
defenses or maintenance of the symbiotic association.
Comparative gene content.
sequences were determined for S. degradans str. 2–40  and two
other bacterial strains that are closely related to T. turnerae T7901
based on 16S rRNA sequences. These are Cellvibrio japonicus
Ueda107  and Hahella chejuensis KCTC 2396 . The shared
evolutionary history of these bacterial strains is evidenced by the
The genome of Teredinibacter
Recently, complete genome
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org2 July 2009 | Volume 4 | Issue 7 | e6085
Table 1. Comparison of general genome features.
DNA, total bp5,192,641 5,057,5317,074,8934,576,573
% G+C 5146 63 52
% Coding86.586.788.7 90.5
No. rRNAs96 159
No. ORFs4690 4008 6144 3790
Mean ORF length973 10831007 1092
Figure 1. Enzymatic degradation of common components of woody plant materials. Enzymatic components required for the breakdown
of cellulose (A) and a hypothetical xylan (B) are shown along with the corresponding EC number designations. Enzymes and the corresponding side-
chains cleaved by them are presented in color while substrate backbones and the corresponding enzymes that cleave them are portrayed in black.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org3 July 2009 | Volume 4 | Issue 7 | e6085
large number of homologues inferred among predicted open
reading frames (ORFs) (Figure 3A) and by the number of these
that have best or near best hits among these genomes in total
genome BLAST searches. Of the 3,502 predicted proteins in the
T. turnerae genome that had at least one BLASTP (E,1e-4) hit to a
protein encoded by another genome (Combo, DB, Wu et al.,
unpublished), 1,670 had a best hit to S. degradans 2–40, 265 to C.
japonicus Ueda107, and 85 to H. chejuensis KCTC 2396. All other
genomes had fewer best hits.
The genomes of T. turnerae, S. degradans
organization and display a considerable degree of synteny when
protein-coding regions are aligned (Figure 3B). An unusual shared
similarity is also observed in the organization of ribosomal genes in
T. turnerae and S. degradans. The ribosomal operons of most bacteria
are composed of a 16S rRNA gene (rrs) followed by an internal
transcribed spacer (ITS1), a 23S rRNA gene (rrl), a second internal
transcribed spacer (ITS2) and finally a 5S rRNA gene (rrf). In
addition to ribosomal operons with this canonical organization,
the genomes of T. turnerae and S. degradans each contain a single
occurrence of rrs and rrl genes that are separated by putative
protein coding sequences (Figure 4) rather than a typical ITS1.
The ITS1 region of most known gamma-proteobacteria ranges
from ,250–1,000 nucleotides in length and encodes one or two
tRNA genes but does not contain protein-coding genes . In
contrast, S. degradans contains two putative protein-coding
sequences within this region while the comparable region in T.
turnerae encodes eight.
Figure 2. Phylogenetic relationships among T. turnerae and selected gamma-proteobacteria. A maximum likelihood (ML) tree based on
comparative analysis of 16S rDNA (1,389 aligned characters) inferred using PHYML  as implemented in Geneious 3.8.5 (Biomatters Ltd.) is shown
for 23 related Pseudomonadaceae. Genbank accession numbers are indicated. Not shown but used in the analysis are 16S sequences from Escherichia
coli (AE014075 and U00096) and Salmonella typhimurium (AE006468 and CP000026). Sequences were aligned manually using MacGDE 2.3 [Linton E:
MacGDE: Genetic Data Environment for MacOSX [http://www.msu.edu/,lintone/macgde/] taking into consideration secondary structural information
. Bootstrap proportions greater than 70% are expressed to the left of each node as a percentage of 1,000 replicates. The ML tree topology is
identical to the consensus tree generated with the same alignment using Mr. Bayes 3.0 (not shown) . Bold taxon labels signify that a complete
genome sequence has been determined and is publicly available; asterisks denote symbionts of invertebrates and ‘‘M’’ denotes isolation from marine
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org4 July 2009 | Volume 4 | Issue 7 | e6085
In both genomes, the intervening ORFs most proximal to the rrl
are homologous putative heptosyltransferase genes. Moreover, the
remaining seven ORFs embedded in the ITS1 of T. turnerae also
have homologues elsewhere in the S. degradans genome, five of
which are syntenic in both genomes. This gene organization
strongly suggests that a common ancestral insertion resulted in the
proximity of rrl and heptosyltransferase genes in both genomes and
that this event was likely followed by at least four rearrangement
events (Figure 4) to arrive at the extant gene orders. The large size
of these insertions and alternating orientation of the contained
Figure 3. Genomic comparison of T. turnerae, S. degradans, H. chejuensis and C. japonicus. (A) Venn diagram portraying occurrence of
predicted protein coding genes of homologous origin shared among the genomes of T. turnerae, S. degradans, H. chejuensis and C. japonicus. (B)
Synteny of predicted protein coding genes among the genomes of T. turnerae (x-axis, top and bottom) and S. degradans (y-axis, top) and C. japonicus
(y-axis, bottom) inferred using PROmer (PROtein MUMmer). Circular chromosomes are depicted linearly with the origins of replication at map
coordinates (0,0). Dots depict location of homologous proteins relative to the origins with red and blue representing homology on the same or
opposite strand, respectively.
Figure 4. Anomalous organization of ribosomal operon A (rrnA) in T. turnerae and S. degradans. One of three ribosomal RNA operons
(rrnA) in the T. turnerae genome displays unusual organization. In this operon, the small (rrs) and large (rrl) subunit ribosomal RNA genes are
separated by protein coding genes in both T. turnerae and S. degradans rather than by internal transcribed spacers (ITS) as in most bacteria. All
intervening genes have homologues in both genomes. The common location of homologous heptosyltransferase genes in both genomes suggests
at least one common ancestral insertion. Ribosomal rRNA genes are depicted in blue. Open reading frames (red, green, orange, and light blue) have
been colored to distinguish homologous genes. Dotted lines mark indel boundaries. Chromosomal locus coordinates are indicated to the right of
each depicted genome region and unique locus tag numbers are indicated beneath each locus without the preceding GenBank prefixes (TERTU_ for
T.turnerae or Sde_ for S. degradans, respectively).
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org5 July 2009 | Volume 4 | Issue 7 | e6085
ORFs suggest that these rrs and rrl genes are no longer contained
within a single transcriptional unit in either genome.
Polysaccharide degradation systems of Teredinibacter
Like that of its closest known relative,
S. degradans, the T. turnerae genome is notable for containing an
unusually large number of protein domains involved in the
degradation of complex polysaccharides, including glycoside
hydrolases (GH), carbohydrate esterases (CE), pectin lyases (PL),
and carbohydrate binding modules (CBM). However, in contrast
to S. degradans, which is a generalist capable of degrading more
than 10 types of plant, algal, animal and fungal polysaccharides
[19,20,21], the T. turnerae genome lacks enzyme systems for
degradation of common marine polysaccharides including agar,
alginate, and fucoidan and has only comparatively sparse
representation of chitinase (two vs. seven in S. degradans) and
laminarinase (six vs. ten in S. degradans) genes. Enzymes for
degradation of the fungal polysaccharide pullulan are also absent
in T. turnerae. Instead, the gene content of the T. turnerae genome
suggestsa high degreeof
polysaccharides associated with woody plant materials, including
cellulose, xylan, mannan, galactorhamnan and pectin.
The T. turnerae genome encodes a total of 123 ORFs dedicated
to processing complex polysaccharides. Of these ORFs, 95 encode
GH domains, 4 encode PL domains, 18 encode CE domains, and
4 encode both GH and CE domains (Supporting Information:
Tables S1, S2, and S3). The total number of GH domains in T.
turnerae is similar to that of S. degradans  and C. japonicus ,
which places T. turnerae among the top 5% of over 750 bacteria
with sequenced genomes (BH & PMC, unpublished). Notably, T.
turnerae possesses a high number of CE domains per single genome
among these organisms (see Table 2), suggesting a considerable
investment in capacity for degrading hemicelluloses.
The diversity of GH domain families represented in the T.
turnerae genome is also comparable to those of the other cellulolytic
bacteria. The T. turnerae genome encodes catalytic domains
representing 38 different GH families. This compares to 38 in S.
degradans, 42 in C. japonicus, and 44 in a recently characterized
metagenome derived from a community containing over a
thousand bacterial types in the hindgut of the termite Nasutitermes
sp . (Figure 5).
Although the absolute number and diversity of GH domains is
similar, the proportion of GH domains predicted to have activity
against components of woody plant materials (cellulose, xylan,
mannan, and rhamnogalactans) in the T. turnerae genome (Figure 5)
is 53%, nearly twice that of S. degradans (29%), C. japonicus (29%) or
the Nasutitermes sp. hindgut community (27%). Indeed, seven GH
families (GH6, GH9, GH10, GH11, GH44, GH45, and GH62)
are represented by at least twofold more domains in T. turnerae
than in C. japonicus or S. degradans. Each of these has predicted
activity against cellulose or xylan.
Multidomain and multicatalytic enzymes.
genome is also unusual in the number of multicatalytic enzymes
(single proteins with multiple catalytic domains, each with distinct
predicted catalytic activities) that it encodes and in the domain
composition of these enzymes. While multidomain carbohydrases
are common, most are composed of a single catalytic domain plus
The T. turnerae
Table 2. Summary of carbohydrate binding and catalytic domains found per genome or metagenome.
Teredinibacter. turneraeSaccharophagus degradansCcllvibrio japonicus Nasutitermes Community
Glycoside hydrolases (GHs) 101130122 704
Polysaccharide lyases (PLs)5 33 1410
Carbohydrate esterases (CEs) 2215 19 n/d
Carbohydrate binding modules (CBMs) 117136 93 10
Figure 5. Prevalence of GH domains as a function of substrate specificity. The genome of T. turnerae contains a larger proportion of
glycoside hydrolase (GH) domains with specificity for major wood components (cellulose, xylan, mannans, and rhamnogalactans) than do other
compared genomes and metagenomes. GH domains are sorted by known substrate specificity and presented as a fraction of the total number of GH
domains per genome. Substrate specificities are coded as follows: green=cellulose/xylan, (GH families 5, 6, 8, 9, 10, 11, 12, 44, 45, 51, 52, 62, and 74),
dark green=agarose (GH families 50 and 86), light green=chitin (GH families 18, 19, and 20), light grey=peptidoglycan (GH families 28 and 105),
dark grey=laminarin (GH families 16 and 81), black=pectin (GH families 28 and 105), purple=other woody plant cell wall polysaccharides (GH
families 26, 53, and 67) and blue=GH domains with other specificities or specificities not uniquely predicted by family designation. Proportions are
expressed as a fraction of the total number of GH domains found in the genomes of Teredinibacter turnerae (101), Saccharophagus degradans (130),
Cellvibrio japonicus (122), and Nasutitermes termite hindgut metagenome community (704) respectively. The fraction of total GH domains with
specificity toward cellulose and xylan is indicated below each species name.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org6July 2009 | Volume 4 | Issue 7 | e6085
one or more carbohydrate-binding modules or other small
domains of unknown function. It is relatively unusual for such
enzymes to include multiple catalytic domains and still less
common for those domains to differ in substrate specificity (BH &
PMC, unpublished). A small number of such multifunctional,
multidomain glycosidases have been characterized experimentally,
including a cellulase (CelAB) from T. turnerae , a cellulase
(Cel5A) and a chitinase (ChiB) from S. degradans [21,24], and an
endoglycosidase from Enterococcus faecalis . The T. turnerae
genome encodes seven such multicatalytic enzymes (Figure 6).
The advantages of co-localizing distinct catalytic domains
within a single protein are unknown, but may reflect the special
problems encountered by organisms that depend on extracellular
degradation of complex insoluble substrates like wood. For
example, cellulose, a major constituent of wood, requires both
endoglucanase and cellobiohydrolase activities to convert the
insoluble polymer efficiently into soluble sugars that can be
transported across the cell envelope. Combining endoglucanase
and cellobiohydrolase activity along with carbohydrate binding
modules in a single molecule may ensure proximity of these
complementary catalytic domains, and therefore enhanced
activity, while also preventing diffusion of these proteins away
from their insoluble substrates.
The complete hydrolysis of hemicellulose also requires a
combination of hydrolytic activities, including glycoside hydrolases
and esterases. Hemicellulose in softwoods and hardwoods are
predominantly composed of O-acetyl-(4-O-methylglucurono)xylan
. In contrast, arabinoxylan is the major heteroxylan in grasses.
Consistent with its proposed specificity for degrading woody plant
and acetylxylan esterase domains (TERTU_3603, TERTU_1680,
and TERTU_1678), and a fourth encodes xylanase and methylglu-
curonoyl esterase domains (TERTU_3447). Although a few
multicatalytic enzymes with the former combination of specificities
have beendescribed previously, e.g. [27,28], this is the first report of
the latter combination.
Another proposed advantage of such multicatalytic proteins is
that they may promote intramolecular synergism . For
example, endoglucanases and cellobiohydrolases are thought to
display synergism because the former produces substrate (reducing
ends) that can be degraded by the latter. Indeed, such synergism
has been observed between naturally co-occurring endoglucanases
and exoglucanases or accessory enzymes such as cellodextrinases
and cellobiases in Fibrobacter , Clostridium , Cellulomonas 
and fungi . Moreover, cellulases and cellulase complexes
(synthetic cellulosomes) engineered to contain multiple enzymatic
specificities also display synergistic activity against cellulose 
and complex plant substrates [34,35].
The co-localization of multiple activities within a single protein
may also have advantages specific to symbiosis. It is thought that
enzymes produced by shipworm symbionts are transported by an
as yet unknown mechanism from the shipworm’s gills, where the
symbionts are found, to the digestive system where lignocellulose
degradation is thought to occur . Therefore, combining
multiple specificities within a single protein may simplify the task
of protein transport.
Carbohydrate binding modules.
of carbohydrate-active catalytic domains, the T. turnerae genome
also contains 117 domains predicted to bind carbohydrates,
second only to S. degradans. Most abundant are those predicted to
bind crystalline cellulose (CBM2 and CBM10), which account for
nearly 50% of all CBMs in the genome. These are also the most
abundant CBM types in S. degradans and C. japonicus. However,
relative to these other genomes, T. turnerae is enriched in the xylan
binding CBM family 22 (seven in T. turnerae, one in S. degradan,s
In addition to a diversity
Figure 6. Multicatalytic carbohydrate-active enzymes. The genome of T. turnerae encodes seven multicatalytic enzymes (enzymes containing
multiple catalytic domains with distinct predicted substrate specificities). Domain architecture is depicted in schematic form. Key: glycoside
hydrolases (GH), carbohydrate esterases (CE), polysaccharide lyases (PL), catalytic domains (white rounded rectangles), carbohydrate binding modules
(grey squares), secretion signals (dark circles), and polyserine linker regions (‘‘SS’’). Numbers specify domain families (http://afmb.cnrs-mrs.fr/CAZY/).
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org7July 2009 | Volume 4 | Issue 7 | e6085
and none in C. japonicus), consistent with a specialization for
degradation of hemicellulose or heteroxylan components of wood.
Interestingly, seventeen ORFs in the T. turnerae genome contain
CBMs that either lack associated catalytic domains or lack
domains with putative carbohydrase activity (GH, CE, or PL
enzyme functions, Supporting Information: Table S4). Indeed two
of these link CBMs to domains predicted to function as serine
proteases. The functions of these unassociated CBMs and unusual
hybrids are unknown, but may involve modification of the surface
of insoluble woody substrates, or modification of other proteins/
enzymes bound to these surfaces.
Polyserine linker domains.
carbohydrase domains and CBMs in T. turnerae are also unusual,
being uncharacteristically long and comprised nearly entirely of
serine residues. Linkers found in other carbohydrases commonly
consist of repeating proline, threonine and glycine residues. While
the functional significance of polyserine linkers (PSLs) is unknown,
they appear to be characteristic of the carbohydrases of T. turnerae
and its closest relatives C. japonicus and S. degradans [37,38].
The S. degradans genome encodes 46 carbohydrate active
proteins that contain PSLs. On average, these are composed of
80% serine residues and are 39 residues in length . Similarly,
T. turnerae encodes 42 carbohydrate active proteins containing
PSLs with an average serine content of 83% and average length of
44 residues. Approximately one-third of GH and CE domain-
containing proteins are linked by PSLs to one or more CBMs, as
are most (75%) PL domains. Of the multidomain glycoside
hydrolases, .80% of these also exhibit polyserine linker regions
while all of the multidomain carbohydrate esterases and
polysaccharide lyases contain PSLs.
The linker regions that join
Bioinformatics evidence for secretion system gene
In Gram-negative bacteria, several secretion systems are
available to catalyze the extracellular translocation of proteins
and genetic material. For some secretion systems, (e.g. the type II
secretion system, or T2SS), proteins must first be exported across
the inner membrane to the periplasm using the so-called
‘‘conventional’’ or ‘‘broad-specificity’’ Sec pathway before trans-
location across the outer membrane. Other secretion systems use
complex multi-component protein assemblies that bypass the Sec
pathway requirement and directly translocate proteins from the
cytoplasm to the extracellular milieu. Specialized secretion
systems, such as types III, IV, and VI (T3SS, T4SS, and T6SS)
are often hallmarks of intracellular bacterial pathogens and
symbionts , and many of the secretion system substrates,
termed ‘‘effectors,’’ have been shown to modify or disrupt host cell
Type 2 secretion systems.
both Sec and Sec-independent (twin-arginine, Tat) pathways
contribute to protein translocation across the inner member in T.
turnerae (for reviews, see [43,44,45]). All essential components of the
Sec translocase encoded by secA, secY, secE, and secG, as well as gene
products of secD, secF, yajC and yidC that are peripherally associated
with the translocase have been identified. Both the co-
riboprotein and the FtsY receptor) and the postranslational
pathway (involving the SecB chaperone) of translocase targeting
appear to be present. Over 20% of the predicted proteins in the T.
turnerae genome were identified with N-terminal signal peptides by
SignalP 3.0  and are predicted to be Sec pathway substrates.
Proteins deposited in the periplasm by the Sec and Tat pathways
may be translocated through the outer membrane by the T2SS,
also known as the general secretory pathway. All known essential
Genome sequence suggests that
(involving the Ffh/SRP54
components of T2S are arranged in a single large operon
(gspCDEFGHIJKLMN) in the
previously observed in S. degradans.
Type 3 and type 4 secretion systems.
are an important means by which intracellular symbiotic and
pathogenic bacteria modulate interactions with their hosts. Two of
these, the T3 and T4 secretion systems, are known to play
important roles in many plant and animal symbionts and
pathogens . For example, T3 and T4 pathways are involved
in the establishment of symbiosis by the tsetse fly endosymbiont
Sodalis glossinidius and the plant endosymbiont Mesorhizobium loti
[48,49], respectively. Surprisingly, elements encoding T3 or
T4SSs are absent from the genome of T. turnerae.
Type 6 secretion systems.
of the T6SS, a Sec-independent secretion system only recently
described in Gram-negative bacteria [50,51], were identified in the
genome of T. turnerae (TERTU_1668-TERTU_1639). Reports
indicate that T6S can perform a myriad of functions including
promoting biofilm formation, and attenuating or enhancing
virulence [50,52,53,54,55]. A protein secreted from Vibrio cholerae
in a T6SS-dependent manner can crosslink actin in macrophages
, thereby indicating that T6SSs are likely to directly
translocate protein(s) into host cells. The fact that T. turnerae
resides inside host cells as an endosymbiont and lacks T3 and T4
secretion suggests that T6S might be a central mechanism for the
initiation and maintenance of the symbiotic interaction. Indeed, it
has been demonstrated that T6S is a determining factor for host-
specificity in the symbiont Rhizobium leguminosarum [57,58].
In order to gain greater insight into the potential function of
T6S in T. turnerae, we compared its T6SS locus structure and gene
content to that of P. aeruginosa HSI-I, a well-characterized T6S
locus, and to the T6S locus of its close relative, S. degradans
(Figure 7). The T6S gene cluster in T. turnerae appears to encode all
known essential proteins of the secretory apparatus (E value: ,1e-
10) (Figure 7A). Among these are genes encoding proteins with
homology to IcmF (TssM) and an AAA+-family ATPase (TssH),
which are hallmarks of the T6SS . The T6SS generally
translocates at least two proteins: haemolysin-coregulated protein
[60,61,62,63]. Genes encoding these apparent substrates of the
system are present in the T6SS cluster of T. turnerae, and
furthermore, two other putative vgrG genes (TERTU_3731 and
TERTU_2226) are located outside of the unit.
Overall, these loci are highly similar, with 17 and 19 of the 22 P.
aeruginosa HSI-I genes conserved in T. turnerae and S. degradans,
respectively. Surprisingly, our analyses indicated that the overlap
in T6S gene between these organisms is not restricted to those
genes that are widely conserved in other T6SSs. Moreover,
detailed comparisons of the proteins putatively involved in T6S
activation in T. turnerae with those of P. aeruginosa, provided
evidence that not only are essential proteins of this pathway
conserved and likely to be functional, but also, that the mechanism
of initiation and signal propagation may be as well. Sequence
alignment of Fha1 with TERTU_1647 (fha) showed that the site of
Fha1 phosphorylation (Thr362) is conserved in the T. turnerae
protein (Figure 7B). Likewise, similar to PpkA, a C-terminal
periplasmic extension was observed in its T. turnerae ortholog
(TERTU_1640). As is typical in T6S kinases, there is no significant
sequence similarity of this region between the species (data not
As observed previously in C. japonicus and S.
degradans, the majority of carbohydrate active enzymes encoded
by T. turnerae appear to be substrates of the type II secretion system
Interestingly, however, elements
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org8 July 2009 | Volume 4 | Issue 7 | e6085
(T2SS). Of the 99 GH and the 22 CE genes, 73 and 14 proteins
respectively are predicted to have an N-terminal signal peptide for
secretion. Notably, four out of five candidate beta-glucosidases do
not have T2SS consensus signal peptides, suggesting that
cellobiose and/or larger polymers may be targeted by a
phosphotransferase transport system for further degradative
processing in the cytoplasm. Consistent with this notion, a
candidate cellobiose phosphotransferase gene (TERTU_3237),
(TERTU_2767, TERTU_2762, and TERTU_0851) have been
identified in T. turnerae.
Over 180 proteins encoded by the Teredinibacter genome are
predicted to be lipoproteins (LipoP 1.0 program ) and so are
likely to become anchored to the outer face of the outer
membrane. Of these, 23 are associated with polysaccharide
degradation. Localization of lignocellulose active proteins to the
outer membrane may provide some of the functional advantages
to Gram-negative bacteria that cellulosomes provide for Gram-
Experimental analysis of the T. turnerae secretome.
total of 123 proteins were identified under the assayed growth
conditions (Figure 8). Based on sequence data, most were
predicted to be secreted proteins or lipoproteins, although a
small proportion had no predicted N-terminal signal peptides and
were presumed to be cytoplasmic proteins released by cell lysis.
The number of proteins detected in spent medium after growth on
SigmaCell (72 proteins), was more than three times greater than
that observed after growth on carboxymethyl cellulose (25
proteins) or sucrose (23 proteins), possibly reflecting the greater
demands of degrading insoluble polysaccharides. Two proteins, a
predicted cellodextrinase (TERTU_0427) and a xylose isomerase
(TERTU_4666), were expressed under all conditions tested.
detected only with SigmaCell as the sole carbon source (Table 3).
Detection of cellodextrinase and xylose isomerase under all growth
conditions suggests that, as in other woody biomass-degrading
required for cellulose and xylan metabolism may be linked. This
may be resolved by further transcriptomic and/or proteomic
hemicellulose components. It should be noted that the methods
used here target secreted proteins and may not efficiently detect
cytoplasmic or periplasmic proteins, or proteins that bind
irreversibly to insoluble substrates.
certain enzymatic activities
on purified cellulose and
substrate. Therefore, organisms that utilize wood as food may
benefit from alternate sources of nitrogen nutrition such as
nitrogen fixation. The genome of T. turnerae revealed a complete
set of nitrogen fixation genes (nif) organized in three main clusters
that span nearly 60,000 bp, or 1% of the genome. The first cluster
contains nitrogenase accessory and regulatory genes including
nifQ, nifBAL, andthe electron
rnfABCDGE. The second cluster
nitrogenase genes encoded by the nifHDKT operon. The third
cluster contains genes nifENX whose gene products function to
synthesize nitrogenase molydenum-iron cofactors, as well as
nifUSVPWZM whose gene products also function in nitrogen
fixation. The organization of the later gene cluster is particularly
similar to the nif gene arrangement in strains of Azotobacter,
diazotrophic Gram-negative gamma-proteobacteria related to
Pseudomonads. Consistent with this observation, sequence
Wood is a carbon-rich but nitrogen-poor
Figure 7. Type 6 secretion system (T6SS) gene clusters. (A) Schematic representation of T6SS clusters of T. turnerae, P. aeruginosa and S.
degradans. The T6SS units are arranged so that the fha ortholog of each species is central. Genes are identified by locus tag numbers (below) and
according to the standardized nomenclature for T6SS proposed by Shalom et al.  (above): tss; core components in all T6SS units; tag: tss-
associated genes which are present in T6SS clusters in more than one bacterial species. Genes highlighted in color are discussed in the text and
homologous genes are represented by the same color. Genes indicated with light grey have not been clearly linked to T6S function or are not widely
conserved in T6S loci. T6SS core genes  are indicated in bold type. (B) Partial sequence alignment of the FHA domain-containing proteins
involved in posttranslational regulation of T6S in T. turnerae and P. aeruginosa. The critical phosphorylation site (Thr362) is conserved in the T.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org9July 2009 | Volume 4 | Issue 7 | e6085
analysis of the T. turnerae NifH protein revealed that the closest
related sequence is the nifH gene from Pseudomonas stutzeri. These
data, along with the absence of nif genes in S. degradans, H. chejuensis,
and C. japonicus, are consistent with the notion that the nif cluster in
T. turnerae was acquired via horizontal gene transfer from a
Pseudomonas-like bacterium, as has been proposed for nif genes in
other microbes .
In addition to genes involved in
nitrogen fixation, about 40 genes in the genome of T. turnerae are
predicted to function in nitrogen assimilation. The majority of
these are dedicated to urea metabolism and transport. Specifically,
an operon containing urease and accessory genes (ureABCEFG) is
flanked by two operons (urtABCDE) dedicated to the energy-
dependent transport of urea. Urease and urea transporter genes
are also found in the genome of S. degradans and Hahella chejuensis,
suggesting that urea metabolism may be common to this lineage of
The T. turnerae genome also encodes eight genes (TERTU_3348,
TERTU_0619, TERTU_4234, TERTU_3878, TERTU_3828,
TERTU_2171, TERTU_1871, and TERTU_1053) predicted to
encode carbon-nitrogen (C-N) hydrolases. This prediction is
supported by the presence in each of these genes of a diagnostic
triad of conserved amino acid residues (Glu-Lys-Cys) required for
attack on cyano- or carbonyl carbon substrates. C-N hydrolases are
members of a protein superfamily containing 13 families, one of
which is responsible for nitrilase activity . A putative nitrilase
function was assigned to a T. turnerae C-N hydrolase gene,
TERTU_4234 based on protein identity (81%) to a functionally
characterized nitrilase (AAR97393) . TERTU_4234 is part of a
seven-gene operon named Nit1C (Supporting Information: Figure
S2) that is evolutionarily conserved from cyanobacteria through
myxobacteria. It has been proposed that Nit1C genes may be
involved in detoxification of xenobiotic compounds from plants and
microbes  and in the production of a PKS/NRPS hybrid
antibiotic cystothiazole A (Feng et al., 2005). Nitrilases are
commercially important in the production of acrylamide and other
fine chemicals [68,69].
Figure 8. Secretome of T. turnerae. Venn diagram depicting proteins
expressed during growth with indicated carbon sources. Numbers in
non-overlapping regions enumerate proteins that were uniquely
expressed and secreted under the indicated condition. Numbers in
overlapping regions enumerate proteins expressed and secreted under
multiple growth conditions.
Table 3. Examples of T. turnerae secreted carbohydrate active enzymes and associated proteins*.
ORFPredicted Function Domain ArchitectureAAMW (kD) LOCSUC CMCSIG
TERTU_2703 carbohydrate esteraseCE3-CBM10-CBM2486 49S
TERTU_0645 endoglucanase GH9-CBM10-CBM2876 91S
TERTU_0607 endoglucanase GH958063 S, L
TERTU_0427cellodextrinaseCBM5-CBM10-GH5 699 75S
TERTU_0046 chitin/cellulose binding proteinCBM33-CBM10 332 35S
TERTU_4506 xylanaseGH8436 49 S, L
TERTU_3603acetylxylan esterase-xylanase CE6-CBM5-CBM10-GH10952100S
TERTU_3565endoglucanse CBM2-GH5628 68S
TERTU_2893endoglucanse GH9-CBM3-CBM5-CBM10 87593S
TERTU_2567 carbohydrate binding protein CBM10-CBM51324 432S
TERTU_2546xylanase CBM2-CBM10-GH10 62966S
TERTU_1675 beta-1,3-glucanaseGH16-CBM6 33837S
TERTU_1599 xylanase GH10-CBM6-CBM22-CBM22955103S
TERTU_1498 alpha-glycosidaseGH31 977 110S, L
TERTU_0766carbohydrate binding proteinCBM13-NPP139243S
TERTU_2895endo- and exoglucanaseGH5-CBM5-CBM10-GH61010 106S
*As determined by 2-D LC MS/MS on spent culture medium (see methods). Abbreviations: number of amino acid residues (AA), molecular weight (MW), predicted
protein localization (LOC), secreted (S), lipoprotein (L), sucrose (SUC), carboxymethycellulose (CMC), and SigmaCell (SIG).
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org 10 July 2009 | Volume 4 | Issue 7 | e6085
Many symbiotic proteobacteria rely on quorum sensing, a
mechanism of bacterial population density-dependent gene
regulation, to structure their host-associated microbial communi-
ties. The genome of Teredinibacter does not appear to encode
homologues of known LuxI or LuxM-type AHL synthases, which
are commonly required for the synthesis of N-acyl homoserine
lactone quorum sensing signals. However, it is worth noting that
Nitrosomonas europea produces three AHLs, even though the genome
of N. europea does not contain a canonical luxI or luxM . A
LuxR-type protein (TERTU_2802), weakly homologous to known
AHL receptors, is encoded within the Teredinibacter genome. It is
not yet known whether this ‘‘orphan’’ LuxR protein may function
as a receptor for AHLs produced by other bacteria. Some gamma-
proteobacteria (Salmonella, E. coli, Klebsiella) do not produce AHLs
themselves, but have functional AHL receptors that detect AHLs
produced by other bacteria within a microbial community .
The AHL receptor gene (sdiA) in these bacteria is considered to be
a horizontal acquisition that followed a loss of the conserved luxI-
luxR cluster . The genome also does not appear to encode a
homologue of LuxS, a synthase of a furanone AI-2 signal .
In addition to AHL- or AI-2- mediated quorum sensing, social
behaviors in all gamma-proteobacteria are mediated by the
orthologues of the GacS/GacA two-component system [73,74].
(TERTU_1191) and a GacA orthologue (TERTU_2408). The
predicted periplasmic loop of GacS and its cytoplasmic linker
domain (responsible for the interactions with the yet unknown
signal)  appear to be the least conserved in this family of
orthologues. It is, therefore, not yet clear whether GacST.t.
responds to the same self-produced signal that was initially
characterized in pseudomonas . Similar to other orthologues,
GacST.t.contains H302, D720 and H878, predicted to function in
autophosphorylation and phosphotransfer to GacA. The GacA is
most likely functional since it contains D54 (a predicted
phosphorylation site) and the highly conserved amino acid residues
(D8-D9, P58-I61, T82-D86) that are predicted to interact with
D54. The GacS/GacA-mediated signal transduction in gamma-
proteobacteria requires an RNA binding protein CsrA (=RsmA),
which interacts with the small regulatory RNAs controlled by
GacS/GacA. The genome of T. turnerae is unusual in that it
appears to contain two CsrA homologues (TERTU_2809 and
TERTU_2436). TERTU_2809 was most similar to the annotated
csrA (rsmA) from Saccharophagus degradans, Cellvibrio japonicus, and
Pseudomonas mendocina. TERTU_2436 does not appear to have
orthologues in these related bacteria and is most similar to
TERTU_2809. The GacS/GacA-Csr system also contributes to
the regulation of genes involved in utilization of various carbon
sources and secondary metabolism .
encodesa GacS orthologue
The T. turnerae genome contains nine gene clusters predicted to
encode secondary metabolite pathways, including multifunctional
and modular polyketide synthase (PKS) and non-ribosomal
peptide synthase (NRPS) enzymes, which are typically involved
in the production of bioactive molecules (Figure 9). Clusters were
delimited as groups of genes with homologues among secondary
metabolite pathways. The clusters range in size from 8 Kb
(Region 9) to 74 Kb (Regions 3 and 4) and contain many large
ORFs, such as TERTU_2858, a remarkable ,22 Kb in length
(Figure 9). Modular PKS and NRPS are enzymes that devote
separate modules, consisting of groups of catalytic domains, to
each elongation step of the growing molecular chain. Thus, size of
these enzymes is correlated to the size of the metabolites. The size
and complexity of some of the clusters suggests that large, complex
metabolites are likely to be produced by T. turnerae. The combined
putative secondary metabolite pathways account for approximate-
ly 380 Kb, or nearly 7% of the T. turnerae genome. Thus, the
fraction of the genome devoted to secondary metabolism in T.
turnerae is comparable to that found in several Streptomyces species
considered to be secondary metabolism specialists [76,77].
A detailed analysis of the secondary metabolome of T. turnerae
T7901 is beyond the scope of this work and will be presented in a
separate manuscript; an overview is presented here. Three of the
clusters, 4, 6 and 7, are NRPS clusters, without PKS elements.
Region 7 contains genes homologous to biosynthesis of a
enterobactin-like catecholate siderophore, including a Dhb/Ent-
F-like NRPS (TERTU_4067) and a entCEBA-like operon required
for production (through conversion of the aromatic amino acid
chorismate) and activation of DHBA (TERTU_4059–4062)
(Figure 9, Region 7 ORFs in green) . This region, predicted
to be responsible for siderophore biosynthesis and iron transport, is
the only one among the 9 detected clusters for which prediction of
the compound class was possible based on BLAST analysis of
ORFs. Completing the T. turnerae secondary metabolome are one
small (Region 9) and two large (Region 2 and 6) modular PKS
gene clusters. Regions 1, 3 and 8 are mixed clusters, containing
genes homologous to both NRPS and modular PKS functions
The large number and high complexity of NRPS and PKS
clusters observed in T. turnerae is in sharp contrast to that seen in
the sequenced genome of S. degradans. In an analysis of S. degradans
2–40 genomic database (http://genome.jgi-psf.org/finished_mic-
robes/micde/micde.home.html), we detected 5 loci containing
ORFs coding for hypothetical PKS/NRPS enzymes. These
regions, which total ,73 Kb in size, represent 1.43% of the S.
degradans 2–40 genome. Only 3 of the putative enzymes (Sde_0688,
Sde_0689 & Sde_3725) exceed 2000aa in length and only one
(Sde_3725) is structurally more complex than a mono-modular
type I PKS. The predicted siderophore biosynthesis cluster in S.
degradans is smaller (17,690 bp) than its T. turnerae (32,073 bp)
counterpart. It remains to be determined whether the greater
complexity of these genes in T. turnerae is correlated with host
Comparative analyses suggest that the NRPS and PKS clusters
of T. turnerae may have phylogentic origins distinct those of S.
degradans. As previously mentioned nearly half of all ORFs in the T.
turnerae genome with significant BLAST hits to another sequenced
genome have best hits to homologues in S. degradans. This is not the
case for PKS/NRPS clusters. For example, regions 1 and 2 are
PKS and/or NRPS enzymes with significant similarities (45–50%)
to recently characterized enzymes of Bacillus amyloliquefaciens from
the difficidin (Dfn), bacillaene (Bae), and bacillomycin D (Bmy)
biosynthesis pathways. These observations may suggest that some,
or possibly all, of the T. turnerae secondary metabolite regions may
have originated from lateral gene transfer events.
Bacteriophage, restriction-modification and mobile-DNA
The intracellular environment of T. turnerae is thought to
provide it limited exposure to mobile genetic elements. Nonethe-
less, the genome of T. turnerae indicates a history of exposure to
foreign genetic material, including bacteriophage (Supporting
Information: Table S5) and transposable elements. Two gene
clusters in the T. turnerae genome, cas/csd and cmr, are predicted to
function in generating and maintaining clusters of regularly
interspaced short palindromic repeats (CRISPRs) (Supporting
Information: Figure S3A). CRISPR and associated genes provide
bacteria with acquired immunity against infection by bacterio-
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org 11July 2009 | Volume 4 | Issue 7 | e6085
phage, and possibly other extrachromosomal elements , via an
incompletely understood mechanism involving incorporation of
short non-coding and non-repetitive spacer sequences derived
from phage during viral challenge .
Based on the arrangement, orientation and sequence of
CRISPR-associated genes, including Cas5 (TERTU_3118), it is
likely that T. turnerae encodes the Dvulg subtype of the Cas guild
. Thirty-one units of direct repeats, each 32 bp in length, were
identified downstream of the cas2 gene. Each of these CRISPR
repeats is capable of forming a hairpin structure when transcribed,
consisting of a 7 base-pair stem and a loop of 5 nucleotides
(Supporting Information: Figure S3B). As has been described for
other CRISPR bacterial systems, the terminal repeat sequence is a
variant of other repeats in the unit . Additionally, the
secondary structure of the T. turnerae repeats is consistent with
the cluster 3 type of CRISPRs, which are also associated with
Dvulg Cas genes . CRISPR and associated genes have been
identified in the C. japonicus genome  but not in S. degradans.
The T. turnerae genome encodes 30 CRISPR spacers with a
mean length of 34 nucleotides. Two of these spacers closely
matched (20 of 22 base pairs) regions of the Vibrio phage VHML
encoding a putative baseplate spike protein (orf30) and the
Ralstonia phage phiRSA1 encoding an endonuclease subunit
(ORF9) respectively, however, exact cognate bacteriophage
sequences were not identified in public databases.
DNA restriction-modification (R-M) systems have also been
proposed as a mechanism of bacteriophage immunity. While it
remains challenging to identify endonuclease (R) genes bioinfor-
matically, nucleic acid methylase (M) genes are well conserved.
There are at least two R-M systems in the T. turnerae genome
anchored by methylase genes TERTU_3913 and TERTU_2390.
Both are predicted to be type I R-M systems, and as such are
associated with genes that encode specificity (S) subunits. In
comparison, two type I R-M systems and two type III methylases
were identified in C. japonicus (data not shown). No R-M associated
methylase genes were identified in S. degradans.
Detailed genomic information is available for comparatively few
intracellular endosymbionts. Therefore, a central question with
regard to each of these genomes is to what extent does genome
content and organization reflect adaptation to this ecologically
important niche. Unlike most intracellular symbionts examined to
date, T. turnerae is capable of growth in vitro under simple defined
conditions. However, despite considerable effort, this bacterium
has never been isolated from any environment other than the gill
tissue of teredinid bivalves. Thus, although T. turnerae appears to be
capable of independent existence, the extent to which this
bacterium may grow and/or reproduce outside of its host in
nature remains unknown.
Figure 9. Secondary metabolism gene clusters of T. turnerae. Predicted secondary metabolite gene clusters in the genome of T. turnerae
T7901. Regions are shown in order of distance from the origin of replication. Sequence coordinates of each region are indicated beneath the region
number. NRPS: nonribosomal peptide synthetase, PKS, polyketide synthase.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org12 July 2009 | Volume 4 | Issue 7 | e6085
Genome content and organization may provide evidence of the
extent to which T. turnerae is restricted to the intracellular
environment. Considerable attention has been paid to intracellular
pathogens and endosymbionts of animals and the broad syndrome
of genomic modifications associated with increasing dependence
on intracellular existence. Such modifications may include
reduction in genome size, decreased %G+C, increased fixation
of harmful mutations, loss of genes of core metabolism including
DNA repair, formation of pseudogenes, transitory proliferation of
insertion elements, and reduction in number of ribosomal operons
and tRNA genes [9,47,84].
Contrary to observations on the genomes of other bacteria that
form stable, long-term, intracellular associations with animals, the
circular genome of T. turnerae is comparatively large. In fact, it is
larger and contains significantly more predicted open reading
frames than has been reported for its closest known free-living
relative, S. degradans (Table 1). Also, unlike other intracellular
symbionts, the genome of T. turnerae shows no reduction in %G+C
compared to its closest free-living relative but is in fact 5% greater.
It contains a greater number of ribosomal RNA genes and a
comparable number of tRNA genes. Moreover, T. turnerae
maintains a complete complement of genes involved in virtually
all core metabolic functions including DNA repair and contains
only three predicted pseudogenes as compared to nine in S.
degradans. Although the number of insertion elements detected in
T. turnerae significantly exceeds that of S. degradans, this number is
consistent with the profiles of pathogens, such as Listeria
monocytogenes , and is far less than that determined for recently
examined facultative endosymbionts . Also contrary to
expectations for an intracellular endosymbiont, the genome of T.
turnerae suggests a recent history of exposure to bacteriophage. The
T. turnerae genome encodes and maintains a phage defense arsenal
that includes CRISPR and at least two restriction modification
systems and contains a number of phage elements similar to that of
its free-living relative, S. degradans.
Another interesting feature of the T. turnerae genome is the
magnitude of its secondary metabolic potential. A proposed
function of secondary metabolites is to act as antimicrobials to
suppress competition from other bacteria. This again could suggest
recent or current competition with free-living bacteria. However,
secondary metabolites might also play a role in symbiosis. It should
be noted that T. turnerae is closely related to Candidatus Endobugula
sertula, a bacterial symbiont of bryozoans. This symbiont is
proposed to be responsible for the biosynthesis of bryostatins,
polyketides that protect the bryozoan larvae from predation. The
metabolites of T. turnerae may play a similar role in the shipworm
symbiosis. Alternatively, they could suppress microbial competitors
for infection of the host tissues or digestive system, defend the host
against pathogens, or play a role in communication with the host.
Finally, shipworm gills typically contain several related bacterial
species in addition to T. turnerae. Thus secondary metabolites may
be important in community assembly within the gill tissue by
regulating competing populations of coexisting endosymbionts.
In summary, the genome of T. turnerae suggests that of a
facultative endosymbiont that either maintains a significant
ecological niche outside of its host, or of a bacterium that has
only recently become restricted to the intracellular environment.
We can identify no feature, or combination of features, of its
genome content or organization that uniquely identifies T. turnerae
as a symbiotic bacterium. Nor can we identify, based on
bioinformatics, any gene that can uniquely be identified as a
‘‘symbiosis gene’’, although there is some indication that such
genes may exist. Given the widespread occurrence and prevalence
of T. turnerae among phylogenetically and biogeographically
diverse shipworm taxa, and the long fossil history of these host
groups, we favor the hypothesis that T. turnerae has maintained a
long history of relatively independent facultative association with
shipworms. It remains unclear why the evolutionary trajectory of
this symbiosis has not lead to evidence of greater host dependence.
Additional, comparative genomic analysis of cultivated and as yet
uncultivated shipworm symbiont strains and species may lead to a
better understanding of this elusive question.
Materials and Methods
Provenance of the sequenced strain
The sequenced strain T. turnerae T7901 was isolated by John
Waterbury, Woods Hole Oceanographic Institution from a
specimen of the shipworm Bankia gouldi collected from the Newport
River Estuary, Beaufort North Carolina in 1979. This strain, and
53 similar strains of T. turnerae isolated from a variety of shipworm
species by Waterbury et al. [1,2] between 1979 and 1986, have
been deposited to the Ocean Genome Resource, (OGR accession
number I00002) a public biorepository operated by Ocean
Genome Legacy, Inc. The sequenced strain has also been
deposited to the American Type Culture Collection (accession
Cultivation of Teredinibacter turnerae
Strains were grown in liquid batch culture in shipworm basal
medium (SBM) as previously described . SBM was supple-
mented with sucrose (final concentration 0.5%) and ammonium
chloride (final concentration 5 mM). Difco agar (1%) was added
for plate cultures. For genomic DNA extractions, a single colony of
T. turnerae was used to inoculate 100 ml aliquots of SBM. Cultures
were incubated with mild agitation (100 rpm) at 30uC until optical
densities between 0.08 and 0.10 OD600 units were achieved. The
resulting cell pellet was harvested by centrifugation at 12,000 rpm
for 15 minutes.
Genomic DNA preparation
A cell pellet of approximately 125 ml in volume was resuspended
in 4.75 ml of TE 8.0 buffer (10 mM Tris, pH 8.0, 1 mM EDTA).
After addition of 1.25 ml of 10% SDS and 12.5 ml of Proteinase K
(20%), the suspension was mixed by inversion and incubated at
37uC for 60 minutes. To this suspension was added 600 ml of NaCl
(5 M) and 375 ml CTAB (10% in 0.7 M NaCl) prewarmed to
65uC. After incubation at 65uC for 20 min., the suspension was
allowed to cool to room temperature, 6 ml of dichloromethane
was added. The suspension was then mixed by inversion, phases
were separated by centrifugations (80006g, 15 min), and the
aqueous phase was retained and subjected to 2 additional rounds
of extraction with dichloromethane. Nucleic acids were precipi-
tated from the solution by addition of 0.65 volumes of isopropanol,
spooled on a glass rod, washed by submersion in EtOH (100%),
dissolved in 500 ml of TE 8.0 buffer containing 1 ml RNase
(100 mg/ml), and incubated at 37uC for 60 min. After addition of
NaOAc (0.3 M final concentration), two volumes of EtOH
(100%), were added and the DNA was spooled onto a glass rod,
washed with EtOH (100%) and dissolved in 500 ml of TE 8.0
buffer. Genomic DNA was subsequently purified using DNeasy
mini spin columns (Qiagen) according to the manufacturer’s
Genome sequencing and assembly
The complete genome sequence was determined using a
combination of Sanger  and Roche-454 GS20  technol-
ogies as described in . Three libraries were made – a small
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org13July 2009 | Volume 4 | Issue 7 | e6085
insert library of 3 to 4 Kb, a medium insert library of 8 to 10 Kb,
and a fosmid library with inserts of 33–39 Kb. A total of 6144
Sanger reads were performed for each library. A single plate of
454 GS20 sequencing (158,697 reads) was then performed to
supplement the Sanger sequence. The Celera Assembler  and
JCVI’s in-house hybrid assembly method were used to assemble
the combined Sanger/454 data, resulting in three sequence
scaffolds containing 28 intra-scaffold gaps. The total number of
sequence gaps was reduced to 13 using JCVI’s automated closure
procedure (Hostetler et al., unpublished) which analyzes assembly
results, identifies finishing targets, designs primers, selects clones,
and chooses and performs sequencing reactions in an automated
pipeline. The remaining sequence gaps and low sequence coverage
areas were resolved manually using a combination of primer
walking, PCR and transposon-mediated sequencing. The jump-
start feature of the Celera Assembler was used for final sequence
assembly achieving a final coverage of 21.2X.
An initial set of predicted protein-coding regions was identified
using GLIMMER [91,92]. Those shorter than 30 amino acids and
those with overlaps of higher scoring regions were eliminated. The
likely origin of replication was identified and base pair 1 was
designated adjacent to the dnaA gene . Putative protein
functions were assigned using JCVI’s AutoAnnotate pipeline
which searches against an in-house non-redundant protein
database using BLASTP  then extends the BLASTP results
using the BLAST-Extend-Repraze (BER) method to improve
identification of gene boundaries. Putative proteins were also
compared to two sets of hidden Markov models (HMMs): Pfam
HMMs , and TIGRFAMs  using the HMMER package
. HMMs were built from highly curated multiple alignments of
proteins thought to share the same function or to be members of
the same protein family. Approximately 70% of putative proteins
whose functions were predicted by auto-annotation were subjected
to individual manual inspection and ‘‘expert’’ annotation by the
authors. The complete genome sequence has submitted to
GenBank (accession # CP001614).
Protein expression/secretion was examined in T. turnerae grown
with three different carbon sources. After two days of growth in
liquid medium containing either sucrose, carboxymethylcellulose
(CMC, a soluble form of cellulose), or SigmaCell (SMC, insoluble
cellulose powder) as the sole carbon source, spent Teredinibacter
turnerae fermentation medium was cleared of cells by centrifugation
and used for proteomic analysis. An 800 mL aliquot of each
cleared fermentation medium was injected onto an 1100/1200
Series Liquid Chromatography System (Agilent Technologies) and
separated on a PLRP-S reversed-phase column (1 mm6150 mm;
Higgins Analytical, Inc.) using a 45 min 15–60% TB gradient
(TA=0.1% trifluoroacetic acid, TB=CH3CN, 0.1% trifluoroa-
cetic acid) at a flow rate of 100 mL min21. Fractions containing
protein were identified by UV (214 nm) and the intensity of
absorbance at this wavelength was used to determine the protein
concentration in each fraction. Fractions were individually dried to
completion under vacuum. Proteins from each dried fraction were
resuspended in 30 mL Trypsin reaction buffer (50 mM Tris-HCl,
20 mM CaCl2, pH 8) and digested overnight at 37uC with 100 ng
of Trypsin (New England Biolabs, Inc.). Eight mL of each digested
fraction was injected into a HPLC-Chip Cube system and
separated on a Protein ID chip comprised of a 40 nL enrichment
column, a 75 mm6150 mm separation column packed with
Zorbax 300SB-C18 5 mm material, and a spray emitter with a
15 mm flow path (Agilent Technologies). Peptides were separated
using a 40 min 5–45% FB linear gradient (FA=0.1% formic acid,
FB=CH3CN, 0.1% formic acid) at a flow rate of 0.4 mL min21
and analyzed online by a 6330 Ion Trap Mass Spectrometer with
a Nano-Electrospray (nanoESI) ionization source (Agilent Tech-
nologies). A capillary voltage of 1700–1900 V (optimized on a per-
chip basis) was used and the skimmer voltage was held at 30 V.
Data were acquired at 25,000 m/z?sec21with a SmartTarget
value of 500,000 and Maximum Accumulation Time of 200 ms.
The MS acquisition range was from 300 to 1800 m/z. Default
parameters for Auto MS2were used. Ions were selected for
fragmentation based on their intensity, with the number of
precursor ions set to 5. The MS/MS Fragmentation Amplitude
was set to 1.30 V with SmartFrag Start and End Amplitude values
set to 30 and 200%, respectively. The MS/MS acquisition range
was from 50 to 2200 m/z. Data acquisition was coordinated with
the start of the LC separation and was stopped after 60 min.
Protein separation, digestion and peptide analysis were repeated in
The ESI-MS/MS data were analyzed using both Spectrum Mill
(Agilent Technologies) and Mascot (Version 2.2, Matrix Science
Ltd.) search engines . For the Spectrum Mill analysis, the raw
data were processed by Spectrum Mill MS Proteomics Workbench
(Rev A.03.02.060b). The default settings in Data Extractor were
used to prepare MS/MS data files for Spectrum Mill processing.
Processed data were then subjected to an MS/MS Search using a
T. turnerae database. The search criteria were set to allow two
missed cleavages by trypsin with no protein modifications. The
precursor mass tolerance and product mass tolerance were set
to62.5 and60.7 Da, respectively. Peptides were filtered by a
Score .7 and a % SPI.60 and only proteins scoring better than
20 were considered valid identifications. For the Mascot analysis,
raw data were converted to .mgf files by DataAnalysis (Version
6.1, Agilent Technologies). Compounds were detected using the
following method parameters: S/N threshold set to 2, Intensity
threshold set to 100 for both positive and negative, and a
maximum number of 8000 compounds with a retention time
window of 0.05 min. The .mgf files were uploaded to Mascot and
searched against a T. turnerae database. The search criteria were set
to allow two missed cleavages by a semi-trypsin digest with no
protein modifications. The tolerances for peptide and MS/MS
were set to 1.2 and 0.6 Da, respectively. Peptide charges of 1+, 2+
and 3+ were selected with MudPIT scoring and an ‘‘ion score or
expect cut-off’’ value of 20. Proteins identified with a Probability
Based Mowse score of 67 or better were considered valid
Found at: doi:10.1371/journal.pone.0006085.s001 (0.15 MB
Glycoside hydrolases of T. turnerae (99 ORFs total; 101
Found at: doi:10.1371/journal.pone.0006085.s002 (0.04 MB
Polysaccharide lyases of T. turnerae (4 ORFs total; 5
22 domains total).
Found at: doi:10.1371/journal.pone.0006085.s003 (0.06 MB
Carbohydrate esterases of T. turnerae (22 ORFs total;
associated with GH, PL and CE domains in T. turnerae (17 ORFs
Carbohydrate binding domain encoding ORFs not
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org 14 July 2009 | Volume 4 | Issue 7 | e6085
total, 24 domains total). PFAM-A abbreviations are used for non-
Found at: doi:10.1371/journal.pone.0006085.s004 (0.05 MB
Found at: doi:10.1371/journal.pone.0006085.s005 (0.04 MB
Prophage associated genes in T. turnerae.
turnerae T7901. Circular plots in order from outermost to
innermost rings: 1) and 2) predicted protein coding regions (blue),
tRNA genes (red), and rRNA genes (pink) in the forward and
reverse strands respectively, 3) local G+C content of the genome
(black), with high and low G+C regions represented by peaks
facing away from or toward the center of the figure respectively, 4)
GC-skew (positive values in green, negative in pink), and 5)
distance in base pairs from the predicted origin of replication.
Note that changes in the sign of GC skew correspond with and
support the predicted origin and terminus of replication.
Found at: doi:10.1371/journal.pone.0006085.s006 (2.15 MB TIF)
Circular representation of the chromosome of T.
context of the T. turnerae nitrilase gene (TERTU_04234) is shown
along with similar nitrilase operons (Nit1C) from other bacterial
genomes. Other proteins encoded by genes commonly found in
Nit1C clusters include 2 hypothetical proteins (hyp1 and hyp2), a
radical SAM superfamily protein (SAM, Pfam 04055), GCN-5
related acetyltransferse (GNAT, Pfam 00583), 59-phosphorybosyl-
5-aminoimidazole synthase-related proteins (AIRS, Pfam 00586),
and a putative flavoprotein (Flavo).
Nit1C gene cluster organization. The genomic
Found at: doi:10.1371/journal.pone.0006085.s007 (0.29 MB TIF)
turnerae. The CRISPR associated cas/csd and cmr loci are shown
(A). Genes belonging to different gene families are distinguished by
color (blue, cas; red, csd; purple, cmr). The predicted CRISPR
repeat RNA hairpin structure is shown (green) with the variant
terminal repeat (orange). Hairpin sequences are oriented with
respect to the cas operon, which is antisense to the genome.
Found at: doi:10.1371/journal.pone.0006085.s008 (0.19 MB TIF)
CRISPR associated genetic loci in Teredinibacter
We thank H. Parrot, R. Melo, A. Messelaar, M. Morris, R. Collins, and D.
Drolet of New England Biolabs for IT and facilities support during the
genome annotation, John Waterbury and Frederica Valois of Woods Hole
Oceanographic Institution for providing bacterial isolates used in this
investigation, and Jonathan Zehr of the University of California, Santa
Cruz for commentary on nif gene annotation. We also thank V. Losick for
critical review of this manuscript.
Conceived and designed the experiments: JY RM NAE JB NW JAE JHB
DLD. Performed the experiments: JY RM NAE CSP LF JB CM JN BS
NW NW JAE JHB DLD. Analyzed the data: JY RM ASD NAE CSP JBH
DR BST BH PMC SS LF AETS CAGS SE AH EWS MGH JP JB CM JN
BPA KC JF AH SK PAL YL BS NW BW MT JM NW JAE JHB DLD.
Contributed reagents/materials/analysis tools: JY RM ASD NAE CSP BH
PMC SS LF AETS CAGS SE AH MGH JP JB CM BS MT JM NW JAE
JHB DLD. Wrote the paper: JY RM NAE CSP SS MGH JB MT JM JHB
1. Waterbury JB, Calloway CB, Turner RD (1983) A Cellulolytic nitrogen-fixing
bacterium cultured from the gland of Deshayes in hipworms (Bivalvia:
Teredinidae). Science 221: 1401–1403.
2. Distel DL, Morrill W, MacLaren-Toussaint N, Franks D, Waterbury J (2002)
Teredinibacter turnerae gen. nov., sp. nov., a dinitrogen-fixing, cellulolytic,
endosymbiotic gamma-proteobacterium isolated from the gills of wood-boring
molluscs (Bivalvia: Teredinidae). Int J Syst Evol Microbiol 52: 2261–2269.
3. Distel DL, Beaudoin DJ, Morrill W (2002) Coexistence of multiple
proteobacterial endosymbionts in the gills of the wood-boring Bivalve Lyrodus
pedicellatus (Bivalvia: Teredinidae). Appl Environ Microbiol 68: 6292–6299.
4. Luyten YA, Thompson JR, Morrill W, Polz MF, Distel DL (2006) Extensive
variation in intracellular symbiont community composition among members of
a single population of the wood-boring bivalve Lyrodus pedicellatus (Bivalvia:
Teredinidae). Appl Environ Microbiol 72: 412–417.
5. Gallager SM, Turner RD, Berg CJ (1981) Physiological aspects of wood
consumption, growth, and reproduction in the shipworm Lyrodus pedicellatus
Quatrefages. Journal of Experimental Marine Biology and Ecology 52: 63–77.
6. Lechene CP, Luyten Y, McMahon G, Distel DL (2007) Quantitative imaging
of nitrogen fixation by individual bacteria within animal cells. Science 317:
7. Lynd LR, Weimer PJ, van Zyl WH, Pretorius IS (2002) Microbial cellulose
utilization: fundamentals and biotechnology. Microbiol Mol Biol Rev 66:
506–577, table of contents.
8. Clarke AJ (1997) Biodegradation of cellulose: enzymology and biotechnology.
Lancaster, PA: Technomic Publishing Co. 272 p.
9. Wernegreen JJ (2005) For better or worse: genomic consequences of
intracellular mutualism and parasitism. Curr Opin Genet Dev 15: 572–583.
10. Weiner RM, Taylor LE 2nd, Henrissat B, Hauser L, Land M, et al. (2008)
Complete genome sequence of the complex carbohydrate-degrading marine
bacterium, Saccharophagus degradans strain 2–40 T. PLoS Genet 4: e1000087.
11. Haygood MG, Davidson SK (1997) Small-subunit rRNA genes and in situ
hybridization with oligonucleotides specific for the bacterial symbionts in the
larvae of the bryozoan Bugula neritina and proposal of ‘‘Candidatus endobugula
sertula’’. Appl Environ Microbiol 63: 4612–4616.
12. Davidson SK, Allen SW, Lim GE, Anderson CM, Haygood MG (2001)
Evidence for the biosynthesis of bryostatins by the bacterial symbiont
‘‘Candidatus Endobugula sertula’’ of the bryozoan Bugula neritina. Appl
Environ Microbiol 67: 4531–4537.
13. Lopanik N, Gustafson KR, Lindquist N (2004) Structure of bryostatin 20: a
symbiont-produced chemical defense for larvae of the host bryozoan, Bugula
neritina. J Nat Prod 67: 1412–1414.
14. Lopanik N, Lindquist N, Targett N (2004) Potent cytotoxins produced by a
microbial symbiont protect host larvae from predation. Oecologia 139:
15. Sudek S, Lopanik NB, Waggoner LE, Hildebrand M, Anderson C, et al. (2007)
Identification of the putative bryostatin polyketide synthase gene cluster from
‘‘Candidatus Endobugula sertula’’, the uncultivated microbial symbiont of the
marine bryozoan Bugula neritina. J Nat Prod 70: 67–74.
16. DeBoy RT, Mongodin EF, Fouts DE, Tailford LE, Khouri H, et al. (2008)
Insights into plant cell wall degradation from the genome sequence of the soil
bacterium Cellvibrio japonicus. J Bacteriol 190: 5455–5463.
17. Jeong H, Yim JH, Lee C, Choi SH, Park YK, et al. (2005) Genomic blueprint
of Hahella chejuensis, a marine microbe producing an algicidal agent. Nucleic
Acids Res 33: 7066–7073.
18. Stewart FJ, Cavanaugh CM (2007) Intragenomic variation and evolution of the
internal transcribed spacer of the rRNA operon in bacteria. J Mol Evol 65:
19. Ekborg NA, Taylor LE, Longmire AG, Henrissat B, Weiner RM, et al. (2006)
Genomic and proteomic analyses of the agarolytic system expressed by
Saccharophagus degradans 2–40. Appl Environ Microbiol 72: 3396–3405.
20. Howard MB, Ekborg NA, Taylor LE, Weiner RM, Hutcheson SW (2003)
Genomic analysis and initial characterization of the chitinolytic system of
Microbulbifer degradans strain 2–40. J Bacteriol 185: 3352–3360.
21. Taylor LE 2nd, Henrissat B, Coutinho PM, Ekborg NA, Hutcheson SW, et al.
(2006) Complete cellulase system in the marine bacterium Saccharophagus
degradans strain 2–40T. J Bacteriol 188: 3849–3861.
22. Warnecke F, Luginbuhl P, Ivanova N, Ghassemian M, Richardson TH, et al.
(2007) Metagenomic and functional analysis of hindgut microbiota of a wood-
feeding higher termite. Nature 450: 560–565.
23. Ekborg NA, Morrill W, Burgoyne AM, Li L, Distel DL (2007) CelAB, a
multifunctional cellulase encoded by Teredinibacter turnerae T7902T, a culturable
symbiont isolated from the wood-boring marine bivalve Lyrodus pedicellatus. Appl
Environ Microbiol 73: 7785–7788.
24. Howard MB, Ekborg NA, Taylor LE 2nd, Weiner RM, Hutcheson SW (2004)
Chitinase B of ‘‘Microbulbifer degradans’’ 2–40 contains two catalytic domains
with different chitinolytic activities. J Bacteriol 186: 1297–1303.
25. Collin M, Fischetti VA (2004) A novel secreted endoglycosidase from
Enterococcus faecalis with activity on human immunoglobulin G and ribonuclease
B. J Biol Chem 279: 22558–22570.
26. Dekker RFH (1985) Biodegradation of the hemicelluloses. In: Higuchi T, ed.
Biosynthesis and Biodegradation of Wood Components. Orlando, FL:
Academic Press, Inc. pp 505–533.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org15 July 2009 | Volume 4 | Issue 7 | e6085
27. Kosugi A, Murashima K, Doi RH (2002) Xylanase and acetyl xylan esterase
activities of XynA, a key subunit of the Clostridium cellulovorans cellulosome for
xylan degradation. Appl Environ Microbiol 68: 6399–6402.
28. Xie G, Bruce DC, Challacombe JF, Chertkov O, Detter JC, et al. (2007)
Genome sequence of the cellulolytic gliding bacterium Cytophaga hutchinsonii.
Appl Environ Microbiol 73: 3536–3546.
29. Qi M, Jun HS, Forsberg CW (2007) Characterization and synergistic
interactions of Fibrobacter succinogenes glycoside hydrolases. Appl Environ
Microbiol 73: 6098–6105.
30. Berger E, Zhang D, Zverlov VV, Schwarz WH (2007) Two noncellulosomal
cellulases of Clostridium thermocellum, Cel9I and Cel48Y, hydrolyse crystalline
cellulose synergistically. FEMS Microbiol Lett 268: 194–201.
31. Poulsen OM, Petersen LW (1992) Degradation of microcrystalline cellulose:
Synergism between different endoglucanases of Cellulomonas sp. ATCC 21399.
Biotechnol Bioeng 39: 121–123.
32. Eriksson T, Karlsson J, Tjerneld F (2002) A model explaining declining rate in
hydrolysis of lignocellulose substrates with cellobiohydrolase I (Cel7A) and
endoglucanase I (Cel7B) of Trichoderma reesei. Appl Biochem Biotechnol 101:
33. Warren RA, Gerhard B, Gilkes NR, Owolabi JB, Kilburn DG, et al. (1987) A
bifunctional exoglucanase-endoglucanase fusion protein. Gene 61: 421–427.
34. Fierobe HP, Bayer EA, Tardif C, Czjzek M, Mechaly A, et al. (2002)
Degradation of cellulose substrates by cellulosome chimeras. Substrate
targeting versus proximity of enzyme components. J Biol Chem 277:
35. Fierobe HP, Mingardon F, Mechaly A, Belaich A, Rincon MT, et al. (2005)
Action of designer cellulosomes on homogeneous versus complex substrates:
controlled incorporation of three distinct enzymes into a defined trifunctional
scaffoldin. J Biol Chem 280: 16325–16334.
36. Distel DL (2003) The biology of marine wood boring bivalves and their
bacterial endosymbionts. In: Goodell B, Nicholas DD, Schultz TP, eds. Wood
Deterioration and Preservation. Washington: American Chemical Society
Press. pp 253–271.
37. Hall J, Hazlewood GP, Huskisson NS, Durrant AJ, Gilbert HJ (1989)
Conserved serine-rich sequences in xylanase and cellulase from Pseudomonas
fluorescens subspecies cellulosa: internal signal sequence and unusual protein
processing. Mol Microbiol 3: 1211–1219.
38. Howard MB, Ekborg NA, Taylor LE, Hutcheson SW, Weiner RM (2004)
Identification and analysis of polyserine linker domains in prokaryotic proteins
with emphasis on the marine bacterium Microbulbifer degradans. Protein Sci 13:
39. Lee VT, Schneewind O (2001) Protein secretion and the pathogenesis of
bacterial infections. Genes Dev 15: 1725–1752.
40. Ninio S, Roy CR (2007) Effector proteins translocated by Legionella pneumophila:
strength in numbers. Trends Microbiol 15: 372–380.
41. Ogawa M, Handa Y, Ashida H, Suzuki M, Sasakawa C (2008) The versatility
of Shigella effectors. Nat Rev Microbiol 6: 11–16.
42. Trosky JE, Liverman AD, Orth K (2008) Yersinia outer proteins: Yops. Cell
Microbiol 10: 557–565.
43. Mori H, Ito K (2001) The Sec protein-translocation pathway. Trends
Microbiol 9: 494–500.
44. Berks BC, Palmer T, Sargent F (2005) Protein targeting by the bacterial twin-
arginine translocation (Tat) pathway. Curr Opin Microbiol 8: 174–181.
45. Rapoport TA (2007) Protein translocation across the eukaryotic endoplasmic
reticulum and bacterial plasma membranes. Nature 450: 663–669.
46. Bendtsen JD, Nielsen H, von Heijne G, Brunak S (2004) Improved prediction
of signal peptides: SignalP 3.0. J Mol Biol 340: 783–795.
47. Dale C, Moran NA (2006) Molecular interactions between bacterial symbionts
and their hosts. Cell 126: 453–465.
48. Dale C, Young SA, Haydon DT, Welburn SC (2001) The insect endosymbiont
Sodalis glossinidius utilizes a type III secretion system for cell invasion. Proc
Natl Acad Sci U S A 98: 1883–1888.
49. Hubber A, Vergunst AC, Sullivan JT, Hooykaas PJ, Ronson CW (2004)
Symbiotic phenotypes and translocated effector proteins of the Mesorhizobium
loti strain R7A VirB/D4 type IV secretion system. Mol Microbiol 54: 561–574.
50. Aschtgen MS, Bernard CS, De Bentzmann S, Lloubes R, Cascales E (2008)
SciN is an outer membrane lipoprotein required for Type VI secretion in
enteroaggregative Escherichia coli. J Bacteriol.
51. Filloux A, Hachani A, Bleves S (2008) The bacterial type VI secretion machine:
yet another player for protein transport across membranes. Microbiology 154:
52. Enos-Berlage JL, Guvener ZT, Keenan CE, McCarter LL (2005) Genetic
determinants of biofilm development of opaque and translucent Vibrio
parahaemolyticus. Mol Microbiol 55: 1160–1182.
53. Parsons DA, Heffron F (2005) sciS, an icmF homolog in Salmonella enterica
serovar Typhimurium, limits intracellular replication and decreases virulence.
Infect Immun 73: 4338–4345.
54. Pilatz S, Breitbach K, Hein N, Fehlhaber B, Schulze J, et al. (2006)
Identification of Burkholderia pseudomallei genes required for the intracellular life
cycle and in vivo virulence. Infect Immun 74: 3576–3586.
55. Zheng J, Leung KY (2007) Dissection of a type VI secretion system in
Edwardsiella tarda. Mol Microbiol 66: 1192–1206.
56. Pukatzki S, Ma AT, Revel AT, Sturtevant D, Mekalanos JJ (2007) Type VI
secretion system translocates a phage tail spike-like protein into target cells
where it cross-links actin. Proc Natl Acad Sci U S A 104: 15508–15513.
57. Van Brussel AA, Zaat SA, Cremers HC, Wijffelman CA, Pees E, et al. (1986)
Role of plant root exudate and Sym plasmid-localized nodulation genes in the
synthesis by Rhizobium leguminosarum of Tsr factor, which causes thick and
short roots on common vetch. J Bacteriol 165: 517–522.
58. Bladergroen MR, Badelt K, Spaink HP (2003) Infection-blocking genes of a
symbiotic Rhizobium leguminosarum strain that are involved in temperature-
dependent protein secretion. Mol Plant Microbe Interact 16: 53–64.
59. Das S, Chaudhuri K (2003) Identification of a unique IAHP (IcmF associated
homologous proteins) cluster in Vibrio cholerae and other proteobacteria through
in silico analysis. In Silico Biol 3: 287–300.
60. Dudley EG, Thomson NR, Parkhill J, Morin NP, Nataro JP (2006) Proteomic
and microarray characterization of the AggR regulon identifies a pheU
pathogenicity island in enteroaggregative Escherichia coli. Mol Microbiol 61:
61. Mougous JD, Cuff ME, Raunser S, Shen A, Zhou M, et al. (2006) A virulence
locus of Pseudomonas aeruginosa encodes a protein secretion apparatus. Science
62. Mougous JD, Gifford CA, Ramsdell TL, Mekalanos JJ (2007) Threonine
phosphorylation post-translationally regulates protein secretion in Pseudomonas
aeruginosa. Nat Cell Biol 9: 797–803.
63. Pukatzki S, Ma AT, Sturtevant D, Krastins B, Sarracino D, et al. (2006)
Identification of a conserved bacterial protein secretion system in Vibrio cholerae
using the Dictyostelium host model system. Proc Natl Acad Sci U S A 103:
64. Juncker AS, Willenbrock H, Von Heijne G, Brunak S, Nielsen H, et al. (2003)
Prediction of lipoprotein signal peptides in Gram-negative bacteria. Protein Sci
65. Kechris KJ, Lin JC, Bickel PJ, Glazer AN (2006) Quantitative exploration of
the occurrence of lateral gene transfer by using nitrogen fixation genes as a case
study. Proc Natl Acad Sci U S A 103: 9584–9589.
66. Pace HC, Brenner C (2001) The nitrilase superfamily: classification, structure
and function. Genome Biol 2: REVIEWS0001.
67. Podar M, Eads JR, Richardson TH (2005) Evolution of a microbial nitrilase
gene family: a comparative and environmental genomics study. BMC Evol Biol
68. Kobayashi M, Shimizu S (2000) Nitrile hydrolases. Curr Opin Chem Biol 4:
69. Robertson DE, Chaplin JA, DeSantis G, Podar M, Madden M, et al. (2004)
Exploring nitrilase sequence space for enantioselective catalysis. Appl Environ
Microbiol 70: 2429–2436.
70. Burton EO, Read HW, Pellitteri MC, Hickey WJ (2005) Identification of acyl-
homoserine lactone signal molecules produced by Nitrosomonas europaea strain
Schmidt. Appl Environ Microbiol 71: 4906–4909.
71. Ahmer BM (2004) Cell-to-cell signalling in Escherichia coli and Salmonella
enterica. Mol Microbiol 52: 933–945.
72. Federle MJ, Bassler BL (2003) Interspecies communication in bacteria. J Clin
Invest 112: 1291–1299.
73. Teplitski M, Ahmer BM (2004) The control of secondary metabolism, motility,
and virulence by the two-component regulatory system BarA/SirA of
Salmonella and other g-proteobacteria. In: Pruss BM, ed. Global Regulatory
Networks in Enteric Bacteria. Kerala, India: Research Signpost.
74. Lapouge K, Schubert M, Allain FH, Haas D (2008) Gac/Rsm signal
transduction pathway of gamma-proteobacteria: from RNA recognition to
regulation of social behaviour. Mol Microbiol 67: 241–253.
75. Zuber S, Carruthers F, Keel C, Mattart A, Blumer C, et al. (2003) GacS sensor
domains pertinent to the regulation of exoproduct formation and to the
biocontrol potential of Pseudomonas fluorescens CHA0. Mol Plant Microbe
Interact 16: 634–644.
76. Bentley SD, Chater KF, Cerden ˜o-Ta ´rraga AM, Challis GL, Thomson NR, et
al. (2002) Complete genome sequence of the model actinomycete Streptomyces
coelicolor A3(2). Nature 417: 141–147.
77. Ikeda H, Ishikawa J, Hanamoto A, Shinose M, Kikuchi H, et al. (2003)
Complete genome sequence and comparative analysis of the industrial
microorganism Streptomyces avermitilis. Nat Biotechnol 21: 526–531.
78. Walsh CT, Liu J, Rusnak F, Sakaitani M (1990) Molecular studies on enzymes
in chorismate metabolism and the enterobactin biosynthetic pathway. Chem
Rev 90: 1105–1129.
79. Chen XH, Koumoutsi A, Scholz R, Eisenreich A, Schneider K, et al. (2007)
Comparative analysis of the complete genome sequence of the plant growth
promoting bacterium Bacillus amyloliquefaciens FZB42. Nature biotechnology 25:
80. Bolotin A, Quinquis B, Sorokin A, Ehrlich SD (2005) Clustered regularly
interspaced short palindrome repeats (CRISPRs) have spacers of extrachro-
mosomal origin. Microbiology 151: 2551–2561.
81. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, et al. (2007)
CRISPR provides acquired resistance against viruses in prokaryotes. Science
82. Haft DH, Selengut J, Mongodin EF, Nelson KE (2005) A guild of 45 CRISPR-
associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in
prokaryotic genomes. PLoS Comput Biol 1: e60.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org 16 July 2009 | Volume 4 | Issue 7 | e6085
83. Kunin V, Sorek R, Hugenholtz P (2007) Evolutionary conservation of sequence Download full-text
and secondary structures in CRISPR repeats. Genome Biol 8: R61.
84. Moran NA (2002) Microbial minimalism: genome reduction in bacterial
pathogens. Cell 108: 583–586.
85. Bordenstein SR, Reznikoff WS (2005) Mobile DNA in obligate intracellular
bacteria. Nat Rev Microbiol 3: 688–699.
86. Plague GR, Dunbar HE, Tran PL, Moran NA (2008) Extensive proliferation of
transposable elements in heritable bacterial symbionts. J Bacteriol 190:
87. Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-
terminating inhibitors. Proc Natl Acad Sci U S A 74: 5463–5467.
88. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, et al. (2005)
Genome sequencing in microfabricated high-density picolitre reactors. Nature
89. Goldberg SM, Johnson J, Busam D, Feldblyum T, Ferriera S, et al. (2006) A
Sanger/pyrosequencing hybrid approach for the generation of high-quality
draft assemblies of marine microbial genomes. Proc Natl Acad Sci U S A 103:
90. Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, et al. (2000) A
whole-genome assembly of Drosophila. Science 287: 2196–2204.
91. Delcher AL, Bratke KA, Powers EC, Salzberg SL (2007) Identifying bacterial
genes and endosymbiont DNA with Glimmer. Bioinformatics 23: 673–679.
92. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL (1999) Improved
microbial gene identification with GLIMMER. Nucleic Acids Res 27:
93. Bramhill D, Kornberg A (1988) Duplex opening by dnaA protein at novel
sequences in initiation of replication at the origin of the E. coli chromosome.
Cell 52: 743–755.
94. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local
alignment search tool. J Mol Biol 215: 403–410.
95. Sammut SJ, Finn RD, Bateman A (2008) Pfam 10 years on: 10,000 families and
still growing. Brief Bioinform 9: 210–219.
96. Selengut JD, Haft DH, Davidsen T, Ganapathy A, Gwinn-Giglio M, et al.
(2007) TIGRFAMs and Genome Properties: tools for the assignment of
molecular function and biological process in prokaryotic genomes. Nucleic
Acids Res 35: D260–264.
97. Eddy SR (1998) Profile hidden Markov models. Bioinformatics 14: 755–763.
98. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS (1999) Probability-based
protein identification by searching sequence databases using mass spectrometry
data. Electrophoresis 20: 3551–3567.
99. Guindon S, Gascuel O (2003) A simple, fast, and accurate algorithm to
estimate large phylogenies by maximum likelihood. Syst Biol 52: 696–704.
100. Gutell RR, Larsen N, Woese CR (1994) Lessons from an evolving rRNA: 16S
and 23S rRNA structures from a comparative perspective. Microbiol Rev 58:
101. Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of
phylogenetic trees. Bioinformatics 17: 754–755.
102. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, et al. (2004)
Versatile and open software for comparing large genomes. Genome Biol 5:
103. Shalom G, Shaw JG, Thomas MS (2007) In vivo expression technology
identifies a type VI secretion system locus in Burkholderia pseudomallei that is
induced upon invasion of macrophages. Microbiology 153: 2689–2699.
104. Bingle LE, Bailey CM, Pallen MJ (2008) Type VI secretion: a beginner’s guide.
Curr Opin Microbiol 11: 3–8.
Complete Genome of T. turnerae
PLoS ONE | www.plosone.org17 July 2009 | Volume 4 | Issue 7 | e6085