Genome sequence of the moderately thermophilic halophile Flexistipes sinusarabici strain (MAS10T)

Standards in Genomic Sciences (Impact Factor: 3.17). 10/2011; 5(1):86-96. DOI: 10.4056/sigs.2235024


Flexistipes sinusarabici Fiala et al. 2000 is the type species of the genus Flexistipes in the family Deferribacteraceae. The species is of interest because of its isolated phylogenetic location in a genomically under-characterized region of the tree of life, and because of its origin in a multiply extreme environment: the Atlantis II Deep brines of the Red Sea, where it had to cope with high temperatures, high salinity, and high concentrations of heavy metals. This is the fourth completed genome sequence to be published of a type strain of the family Deferribacteraceae. The 2,526,590 bp long genome with its 2,346 protein-coding and 53 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.


Standards in Genomic Sciences (2011) 5:86-96 DOI:10.4056/sigs.2235024
The Genomic Standards Consortium
Alla Lapidus, Olga Chertkov, Matt Nolan, Susan Lucas, Nancy Hammon, Shweta, Jan-Fang Cheng, Roxanne Tapia, Cliff Han, Lynne Goodwin, Sam, Konstantinos Liolios, Ioanna Pagani, Natalia Ivanova, Marcel Huntemann, Konstantinos Mavromatis, Natalia Mikhailova, Amrita Pati, Amy Chen, Krishna, Miriam Land, Loren Hauser, Evelyne-Marie Brambilla, Manfred Rohde, Birte Abt, Stefan Spring, Markus Göker, James Bristow, Jonathan A. Eisen, Victor, Philip Hugenholtz, Nikos C. Kyrpides, Hans-Peter Klenk*, and Tanja
DOE Joint Genome Institute, Walnut Creek, California, USA
Los Alamos National Laboratory, Bioscience Division, Los Alamos, New Mexico, USA
Biological Data Management and Technology Center, Lawrence Berkeley National Laboratory, Berkeley, California, USA
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany
HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany
University of California Davis Genome Center, Davis, California, USA
Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
*Corresponding author: Hans-Peter Klenk
Keywords: strictly anaerobic, Gram-negative, non-motile, heterotrophic, moderately thermophilic, marine, brine, Deferribacteraceae, GEBA
Introduction
Strain MAS10T (= DSM 4947 = ATCC 49648) is the type strain of Flexistipes sinusarabici [1,2], which is the type and only species of the genus Flexistipes [1,2]. The strain was first isolated from the Atlantis II Deep brines of the Red Sea [1], together with four related isolates. The generic name derives from the Latin words flexus, a bending, turning, winding, and stipes, a branch of a tree, stick [1]. The species epithet is derived from the Latin words sinus, a curve or fold in land, a gulf, and arabicus, Arabic, describing the place of isolation [1]. From the time of its isolation in the late 1980s until now, no closely related bacterium (16S rRNA identity >90%) has been described. The resistance of the strain to moderate heat, high salt concentrations, and heavy metals [1] should make it an interesting target for extremophile biotechnology. Here we present a summary classification and a set of features for F. sinusarabici MAS10T, together with the description of the complete genomic sequencing and annotation.
Figure 1. Phylogenetic tree highlighting the position of F. sinusarabici relative to the type strains of the other species within the phylum "Deferribacteres". The tree was inferred from 1,459 aligned characters [7,8] of the 16S rRNA gene sequence under the maximum likelihood (ML) criterion [9]. Rooting was done initially using the midpoint method [10] and then checked for its agreement with the current classification (Table 1). The branches are scaled in terms of the expected number of substitutions per site. Numbers adjacent to the branches are support values from 250 ML bootstrap replicates [11] (left) and from 1,000 maximum parsimony bootstrap replicates [12] (right) if larger than 60%. Lineages with type strain genome sequencing projects registered in GOLD [13] are labeled with one asterisk, those also listed as 'Complete and Published' with two asterisks [14-16].
Figure 2. Scanning electron micrograph of F. sinusarabici MAS10T
Classification and features
A representative genomic 16S rRNA sequence of strain MAS10T was compared using NCBI BLAST [3,4] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [5], and the relative frequencies of taxa and keywords (reduced to their stem [6]) were determined, weighted by BLAST scores. The most frequently occurring genera were Acidithiobacillus (60.0%), Deferribacter (26.8%), Flexistipes (8.2%), Desulfuromonas (2.2%) and Calditerrivibrio (1.8%) (80 hits in total). Regarding the single hit to sequences from members of the species, the average identity within HSPs was 98.0%, whereas the average coverage by HSPs was 96.9%. Among all other species, the one yielding the highest score was Deferribacter abyssi (AJ515881), which corresponded to an identity of 89.7% and an HSP coverage of 86.4%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was FR744611 ('succession potential reducers nitrate-treated facility determined temperature and nitrate availability production water Halfdan oil field clone PWB039'), which showed an identity of 96.7% and an HSP coverage of 93.1%. The most frequently occurring keywords within the labels of all environmental samples which yielded hits were 'microbi' (3.9%), 'acid' (3.4%), 'sediment' (3.3%), 'water' (3.0%) and 'oil' (2.4%) (170 hits in total). The most frequently occurring keywords within the labels of those environmental samples which yielded hits of a higher score than the highest scoring species were 'avail, determin, facil, field, halfdan, nitrat, nitrate-tr, oil, potenti, product, reduc, success, temperatur, water' (7.1%) (1 hit in total). While these keywords fit the marine environment from which strain MAS10T originated, they also point to sediments and oil fields, which so far have not been considered as habitats for F. sinusarabici.
Figure 1 shows the phylogenetic neighborhood of F. sinusarabici MAS10T in a 16S rRNA based tree. The sequences of the two identical 16S rRNA gene copies in the genome differ by two nucleotides from the previously published 16S rRNA sequence M59231, which contains 25 ambiguous base calls.
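The score-weighted frequencies described above can be illustrated with a minimal Python sketch. It assumes that each BLAST hit is reduced to a (label, score) pair and that a label's frequency is its summed score divided by the total score of all hits; the function name and the example hits are hypothetical and are not taken from the analysis reported here.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return {label: percent}, where every hit contributes its score as weight.

    `hits` is an iterable of (label, score) pairs, e.g. (genus, BLAST score).
    """
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical hits for illustration only (not data from the paper).
example_hits = [
    ("Deferribacter", 300.0),
    ("Deferribacter", 100.0),
    ("Flexistipes", 600.0),
]
freqs = weighted_frequencies(example_hits)
for genus, percent in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{genus}: {percent:.1f}%")
```

The same function could be applied to stemmed keywords instead of genus names, since only the labels change.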
Cells of strain MAS10T are straight to bent rods, about 0.3 μm wide and 4-50 μm long (Figure 2) [1]. F. sinusarabici was described as non-motile [1]. Spore formation was not observed [1]. MAS10T cells stain Gram-negative, and growth is strictly anaerobic, with the best growth occurring within a temperature range of 45-50°C and a minimum doubling time of 8½ hours [1]. The optimal pH range for the strain is pH 6-8 [1]. Strain MAS10T requires at least 3% NaCl for growth, but also grows at salt concentrations as high as 10% [1]. The organism prefers complex growth substrates such as yeast extract, meat extract, peptone and tryptone, while formate, lactate, citrate, malate, carbohydrates, amino acids and alcohols do not support cell growth [1]. Strain MAS10T shows an unusual resistance against the transcription inhibitor rifampicin [1], which is however also commonly found among the
The chemotaxonomic data for MAS10T are relatively sparse: no information on cell wall structure, quinones or polar lipids is available. The fatty acid composition is dominated by saturated unbranched acids: C (23.3%), C (15.1%), C (12.6%), with some branched acids iso-C (10.2%), anteiso-C (10.2%), iso-C (4.1%), iso-C (3.6%), and few unsaturated acids C (9.9%), C (2.8%), C (2.5%) [1].
Genome sequencing and annotation
Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position [28], and is part of the Genomic Encyclopedia of Bacteria and Archaea project [29]. The genome project is deposited in the Genome On Line Database [13] and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.
Growth conditions and DNA isolation
F. sinusarabici MAS10T, DSM 4947, was grown anaerobically in DSMZ medium 524 (Flexistipes Medium) [30] at 47°C. DNA was isolated from 0.5-1 g of cell paste using the Jetflex Genomic DNA Purification Kit (GENOMED 600100) following the standard protocol as recommended by the manufacturer, but adding 10 µl proteinase K for one hour of extended lysis at 58°C. DNA is available through the DNA Bank Network [31].
Table 1. Classification and general features of F. sinusarabici MAS10T according to the MIGS recommendations [17] and the NamesforLife database [18].
Property                Term                                                                            Evidence code
Current classification  Domain Bacteria                                                                 TAS [19]
                        Phylum “Deferribacteres”                                                        TAS [20,21]
                        Class “Deferribacteres”                                                         TAS [22,23]
                        Order Deferribacterales                                                         TAS [22,24]
                        Family Deferribacteraceae                                                       TAS [22,25]
                        Genus Flexistipes                                                               TAS [1,2]
                        Species Flexistipes sinusarabici                                                TAS [1,2]
                        Type strain MAS10T                                                              TAS [1]
Gram stain              negative                                                                        TAS [1]
Cell shape              straight to acutely bent rods                                                   TAS [1]
Motility                non-motile                                                                      TAS [1]
Sporulation             not observed                                                                    TAS [1]
Temperature range       30-53°C, moderately thermophilic                                                TAS [1]
Optimum temperature     45-50°C                                                                         TAS [1]
Salinity                at least 3% NaCl, grows with up to 18% NaCl                                     TAS [1]
Oxygen requirement      strictly anaerobic                                                              TAS [1]
Carbon source           complex organic components like yeast extract, meat extract, peptone, tryptone  TAS [1]
Energy metabolism       heterotrophic                                                                   TAS [1]
Habitat                 marine, deep brine water                                                        TAS [1]
Biotic relationship     free-living                                                                     TAS [1]
Pathogenicity           none                                                                            TAS [1]
Biosafety level         1                                                                               TAS [26]
Isolation               interface between upper brine layer and deep sea water                          TAS [1]
Geographic location     Atlantis II Deep brines, Red Sea                                                TAS [1]
Sample collection time  1987 or before
Latitude                21.37                                                                           TAS [1]
Longitude               38.07                                                                           TAS [1]
Depth                   2,000-2,200 m                                                                   TAS [1]
Altitude                -2,200 2,200 m                                                                  TAS [1]
Evidence codes - TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [27].
Table 2. Genome sequencing project information
MIGS ID Property Term
MIGS-31 Finishing quality Finished
MIGS-28 Libraries used Four genomic libraries: one 454 pyrosequence standard library, two 454 PE libraries (3 kb, 15.5 kb insert size), one Illumina library
MIGS-29 Sequencing platforms Illumina GAii, 454 GS FLX Titanium
MIGS-31.2 Sequencing coverage 162.0 × Illumina; 37.9 × pyrosequence
MIGS-30 Assemblers Newbler version 2.3, Velvet version 0.7.63, phrap SPS-4.24
MIGS-32 Gene calling method Prodigal 1.4, GenePRIMP
Genbank Date of Release June 17, 2011
GOLD ID Gc01819
NCBI project ID 45817
Database: IMG-GEBA 2505679008
MIGS-13 Source material identifier DSM 4947
Project relevance Tree of Life, GEBA
Genome sequencing and assembly
The genome was sequenced using a combination of Illumina and 454 sequencing platforms. All general aspects of library construction and sequencing can be found at the JGI website [32]. Pyrosequencing reads were assembled using the Newbler assembler (Roche). The initial Newbler assembly, consisting of 175 contigs in two scaffolds, was converted into a phrap [33] assembly by making fake reads from the consensus, to collect the read pairs in the 454 paired end library. Illumina GAii sequencing data (489.7 Mb) were assembled with Velvet [34], and the consensus sequences were shredded into 2.0 kb overlapped fake reads and assembled together with the 454 data. The 454 draft assembly was based on 170.4 Mb of 454 draft data and all of the 454 paired end data. Newbler parameters are -consed -a 50 -l 350 -g -m -ml 20. The Phred/Phrap/Consed software package [33] was used for sequence assembly and quality assessment in the subsequent finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution [32], Dupfinisher [35], or sequencing cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks (J.-F. Cheng, unpublished). A total of 605 additional reactions and 15 shatter libraries were necessary to close gaps and to raise the quality of the finished sequence. Illumina reads were also used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI [36]. The error rate of the completed genome sequence is less than 1 in 100,000. Together, the combination of the Illumina and 454 sequencing platforms provided 199.9 × coverage of the genome. The final assembly contained 248,918 pyrosequence and 395,536,860 Illumina
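The shredding of a consensus sequence into overlapping fake reads can be sketched in Python as follows. This is only an illustration of the general idea, not the JGI pipeline: the 2.0 kb read length comes from the text, whereas the 500 bp overlap, the function name and the toy consensus are assumptions made for this example.

```python
def shred(consensus, read_len=2000, overlap=500):
    """Cut `consensus` into overlapping fake reads of at most `read_len` bases.

    The overlap value is an assumption; the text only states that the
    2.0 kb fake reads overlapped.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be >= 0 and smaller than read_len")
    if not consensus:
        return []
    step = read_len - overlap
    reads = []
    start = 0
    while True:
        reads.append(consensus[start:start + read_len])
        if start + read_len >= len(consensus):
            break  # the last read reaches the end of the consensus
        start += step
    return reads


# Toy 5 kb consensus, for illustration only.
toy_consensus = "ACGT" * 1250
fake_reads = shred(toy_consensus)
print(len(fake_reads), [len(read) for read in fake_reads])
```

Because neighbouring fake reads share their ends, an overlap-based assembler can join them again with reads from another platform, which is the purpose of the step described above.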
Genome annotation
Genes were identified using Prodigal [37] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [38]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database, UniProt, TIGR-Fam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes - Expert Review (IMG-ER) platform [39].
Genome properties
The genome consists of a 2,526,590 bp long circular chromosome with a G+C content of 38.3% (Table 3 and Figure 3). Of the 2,399 genes predicted, 2,346 were protein-coding genes and 53 were RNA genes; 85 pseudogenes were also identified. The majority of the protein-coding genes (75.2%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
Table 3. Genome Statistics
Attribute Value % of Total
Genome size (bp) 2,526,590 100.00%
DNA coding region (bp) 2,179,830 86.28%
DNA G+C content (bp) 967,539 38.29%
Number of replicons 1
Extrachromosomal Elements 0
Total genes 2,399 100.00%
RNA genes 53
rRNA operons 2
Protein-coding genes 2,346 97.79%
Pseudo genes 85 3.54%
Genes with function prediction 1,803 75.16%
Genes in paralog clusters 242 10.09%
Genes assigned to COGs 1,924 80.20%
Genes assigned Pfam domains 1,978 82.45%
Genes with signal peptides 366 15.26%
Genes with transmembrane helices 579 24.14%
CRISPR repeats 0
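The percentages in Table 3 can be rechecked with a short Python sketch. It assumes that the base-pair rows are expressed relative to the genome size and the gene rows relative to the total gene count; the variable and function names are illustrative only, and the counts are copied from the table.

```python
GENOME_SIZE_BP = 2_526_590
TOTAL_GENES = 2_399

# (count, reported percentage) pairs as printed in Table 3.
BP_ROWS = {
    "DNA coding region": (2_179_830, 86.28),
    "DNA G+C content": (967_539, 38.29),
}
GENE_ROWS = {
    "Protein-coding genes": (2_346, 97.79),
    "Pseudo genes": (85, 3.54),
    "Genes with function prediction": (1_803, 75.16),
    "Genes in paralog clusters": (242, 10.09),
    "Genes assigned to COGs": (1_924, 80.20),
    "Genes assigned Pfam domains": (1_978, 82.45),
    "Genes with signal peptides": (366, 15.26),
    "Genes with transmembrane helices": (579, 24.14),
}


def find_mismatches(rows, denominator, tolerance=0.01):
    """List rows whose reported percentage differs from count / denominator."""
    mismatches = []
    for name, (count, reported) in rows.items():
        computed = 100.0 * count / denominator
        if abs(computed - reported) > tolerance:
            mismatches.append((name, reported, round(computed, 2)))
    return mismatches


mismatches = (find_mismatches(BP_ROWS, GENOME_SIZE_BP)
              + find_mismatches(GENE_ROWS, TOTAL_GENES))
print(mismatches)
```

With the values as printed in Table 3, the list of mismatches is empty, and the 2,346 protein-coding genes plus 53 RNA genes add up to the 2,399 total genes.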
Insight into the genome sequence
Comparative genomics
Because no genome sequence is available for Deferribacter abyssi, the species yielding the highest BLAST score, the following comparative analyses were done with D. desulfuricans [14] (GenBank AP011529, AP011530) and Calditerrivibrio nitroreducens (GenBank CP002347, CP002348) [16], the phylogenetically closest organisms for which a genome sequence was available. The genomes of F. sinusarabici, D. desulfuricans and C. nitroreducens are similar in size (2.5 Mb, 2.5 Mb and 2.2 Mb, respectively) and have a similar, quite low G+C content (38%, 30% and 36%, respectively). Whereas F. sinusarabici has no plasmid, D. desulfuricans harbors a 5.9 kb plasmid; C. nitroreducens contains a 30.8 kb megaplasmid.
An estimate of the overall similarity between the three genomes was generated with the GGDC-Genome-to-Genome Distance Calculator [40,41]. This system calculates the distances by comparing the genomes to obtain HSPs (high-scoring segment pairs) and inferring distances from a set of formulas (1, HSP length / total length; 2, identities / HSP length; 3, identities / total length). Table 5 shows the results of the pairwise comparison between the three genomes.
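The three ratios named above can be written out as a small Python sketch. It computes only the ratios themselves, not the distances that GGDC derives from them, and it assumes that "total length" is a single number supplied by the caller (for example the average of both genome lengths); the function name and the input counts are hypothetical and were chosen so that the output resembles the first row of Table 5.

```python
def hsp_ratios(hsp_length, identities, total_length):
    """Return the three ratios used as the basis of the GGDC formulas.

    1: HSP length / total length
    2: identities / HSP length
    3: identities / total length
    """
    if hsp_length <= 0 or total_length <= 0:
        raise ValueError("lengths must be positive")
    return (
        hsp_length / total_length,
        identities / hsp_length,
        identities / total_length,
    )


# Hypothetical counts for illustration, not values from the actual comparison.
r1, r2, r3 = hsp_ratios(hsp_length=59_000, identities=49_088, total_length=1_000_000)
print(f"1: {100 * r1:.1f}%  2: {100 * r2:.1f}%  3: {100 * r3:.1f}%")
```

Ratio 3 is the product of ratios 1 and 2, which is why a high identity within HSPs can coexist with a low identity over the whole genome when only a small fraction of the genome is covered by HSPs.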
The comparison of the F. sinusarabici and D. desulfuricans genomes revealed that 5.9% of the average of both genome lengths is covered with HSPs. The identity within these HSPs was 83.2%, whereas the identity over the whole genome was only 4.9%. Similar results were inferred for F. sinusarabici and C. nitroreducens (Table 5). The genomes of D. desulfuricans and C. nitroreducens show a noticeably higher degree of similarity, with 9.9% of the average of both genome lengths covered with HSPs of 83.3% identity. The identity over the whole length of the genomes was 8.3%. These values corroborate the relationship between the three organisms as shown in the 16S rRNA-based phylogenetic tree in Figure 1, as there is no bootstrap support that F. sinusarabici is more closely related to either C. nitroreducens or D. desulfuricans.
Figure 3. Graphical circular map of the chromosome. From outside to the center: Genes on forward strand (color
by COG categories), Genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red,
other RNAs black), GC content, GC skew.
The fraction of shared genes in the three genomes is shown in a Venn diagram (Figure 4). The numbers of pairwise shared genes were calculated with the phylogenetic profiler function of the IMG/ER platform [39]. The homologous genes within the genomes were detected with a maximum E-value of 10 and a minimum identity of 30%. Roughly 61% of all genes in the genomes (1,400 genes) are shared by all three genomes, with about equal numbers of genes (224 and 246) shared on a pairwise basis by F. sinusarabici and D. desulfuricans or by D. desulfuricans and C. nitroreducens, respectively, to the exclusion of the third genome. Among the 567 unique genes of F. sinusarabici that have no detectable homologs in the genomes of D. desulfuricans and C. nitroreducens (under the sequence similarity thresholds used for the comparison), the 86 genes (3.7% based on the whole gene number) encoding transposases appear to be noteworthy.
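The thresholding and the partition behind such a Venn diagram can be sketched in Python. This is not the IMG/ER phylogenetic profiler: it assumes that pairwise hits are available as (query, subject, E-value, percent identity) tuples and that genes have already been grouped into shared family identifiers. The E-value cutoff is deliberately a required argument with no default, and all names and example data are hypothetical.

```python
def filter_homologs(hits, max_evalue, min_identity=30.0):
    """Keep (query, subject) pairs that pass both similarity thresholds."""
    return {
        (query, subject)
        for query, subject, evalue, identity in hits
        if evalue <= max_evalue and identity >= min_identity
    }


def venn_regions(a, b, c):
    """Split three sets of family identifiers into the seven Venn regions."""
    return {
        "all_three": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Hypothetical family identifiers for three genomes (illustration only).
genome_a = {"fam1", "fam2", "fam3", "fam4"}
genome_b = {"fam1", "fam2", "fam5"}
genome_c = {"fam1", "fam3", "fam6"}
counts = {name: len(members)
          for name, members in venn_regions(genome_a, genome_b, genome_c).items()}
print(counts)
```

The seven regions are disjoint, so their counts sum to the size of the union of the three sets, which gives a quick consistency check for any such diagram.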
Table 4. Number of genes associated with the general COG functional categories
Code value %age Description
J 145 7.0 Translation, ribosomal structure and biogenesis
A 1 0.1 RNA processing and modification
K 84 4.0 Transcription
L 205 9.8 Replication, recombination and repair
B 2 0.1 Chromatin structure and dynamics
D 21 1.0 Cell cycle control, cell division, chromosome partitioning
Y 0 0.0 Nuclear structure
V 26 1.3 Defense mechanisms
T 115 5.5 Signal transduction mechanisms
M 135 6.5 Cell wall/membrane/envelope biogenesis
N 36 1.7 Cell motility
Z 0 0.0 Cytoskeleton
W 0 0.0 Extracellular structures
U 62 3.0 Intracellular trafficking, secretion, and vesicular transport
O 81 3.9 Posttranslational modification, protein turnover, chaperones
C 174 8.3 Energy production and conversion
G 66 3.2 Carbohydrate transport and metabolism
E 198 9.5 Amino acid transport and metabolism
F 52 2.5 Nucleotide transport and metabolism
H 115 5.5 Coenzyme transport and metabolism
I 60 2.9 Lipid transport and metabolism
P 86 4.1 Inorganic ion transport and metabolism
Q 32 1.5 Secondary metabolites biosynthesis, transport and catabolism
R 246 11.8 General function prediction only
S 145 7.0 Function unknown
- 475 19.8 Not in COGs
Table 5. Pairwise comparison of F. sinusarabici, D. desulfuricans and C. nitroreducens using the GGDC-Calculator.
Genomes compared                    1, HSP length / total length [%]  2, identities / HSP length [%]  3, identities / total length [%]
F. sinusarabici   D. desulfuricans  5.9                               83.2                            4.9
F. sinusarabici   C. nitroreducens  5.1                               83.3                            4.3
D. desulfuricans  C. nitroreducens  9.9                               83.3                            8.3
A remarkable difference between the compared organisms is their motility. Whereas F. sinusarabici is described as non-motile, D. desulfuricans is motile by twitching [14] and C. nitroreducens is also described as motile [16]. The mechanism of twitching motility is still unknown, but movement across surfaces is thought to be caused by extension and retraction of type IV pili. A set of genes responsible for twitching motility has been identified in several organisms. In Pseudomonas aeruginosa, a gene cluster involved in pilus biosynthesis and twitching motility was characterized; the gene products of this cluster show a high degree of sequence similarity to the chemotaxis (che) proteins of enterics and the gliding bacterium Myxococcus xanthus [42]. A closer look into the genome sequences of F. sinusarabici, D. desulfuricans and C. nitroreducens revealed the presence of different gene sets coding for chemotaxis proteins. In contrast to D. desulfuricans and C. nitroreducens, F. sinusarabici lacks four che genes (cheB, cheR, cheV, cheW). In P. aeruginosa, a mutation in the pilI gene, a homolog of cheW, led to a blocking of pilus production [42]. It can be assumed that the missing cheW gene in F. sinusarabici might be responsible for the non-motility of the cells, despite the rather large number of 36 genes annotated in the cell motility category of Table 4.
Figure 4. Venn diagram depicting the intersections of protein sets (total number of derived protein
sequences in parentheses) of F. sinusarabici, D. desulfuricans and C. nitroreducens.
Acknowledgements
We would like to gratefully acknowledge the help of Maren Schröder (DSMZ) for growing F. sinusarabici cultures. This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No. DE-AC02-06NA25396, UT-Battelle and Oak Ridge National Laboratory under contract DE-AC05-00OR22725, as well as German Research Foundation (DFG) INST 599/1-2.
References
1. Fiala G, Woese CR, Langworthy TA, Stetter KO. Flexistipes sinusarabici a novel genus and species of eubacteria occurring in the Atlantis II Deep brines of the Red Sea. Arch Microbiol 1990;
2. Validation List No. 75. Int J Syst Evol Microbiol 2000; 50:1415-1417.
3. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990; 215:403-410.
4. Korf I, Yandell M, Bedell J. BLAST, O'Reilly, Sebastopol, 2003.
5. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol 2006;
6. Porter MF. An algorithm for suffix stripping. Program: electronic library and information systems 1980; 14:130-137.
7. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000; 17:540-552.
8. Lee C, Grasso C, Sharlow MF. Multiple sequence alignment using partial order graphs. Bioinformatics 2002; 18:452-464.
9. Stamatakis A, Hoover P, Rougemont J. A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 2008; 57:758-771.
10. Hess PN, De Moraes Russo CA. An empirical test of the midpoint rooting method. Biol J Linn Soc Lond 2007; 92:669-674.
11. Pattengale ND, Alipour M, Bininda-Emonds ORP, Moret BME, Stamatakis A. How many bootstrap replicates are necessary? Lect Notes Comput Sci 2009; 5541:184-200.
12. Swofford DL. PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods), Version 4.0 b10. Sinauer Associates, Sunderland, 2002.
13. Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010; 38:D346-D354.
14. Takaki Y, Shimamura S, Nakagawa S, Fukuhara Y, Horikawa H, Ankai A, Harada T, Hosoyama A, Oguchi A, Fukui S, et al. Bacterial lifestyle in a deep-sea hydrothermal vent chimney revealed by the genome sequence of the thermophilic bacterium Deferribacter desulfuricans SSM1. DNA Res 2010; 17:123-137.
15. Kiss H, Lang E, Lapidus A, Copeland A, Nolan M, Glavina Del Rio T, Chen F, Lucas S, Tice H, Cheng JF, et al. Complete genome sequence of Denitrovibrio acetiphilus type strain (N2460T). Stand Genomic Sci 2010; 2:270-279.
16. Pitluck S, Sikorski J, Zeytun A, Lapidus A, Nolan M, Lucas S, Hammon N, Deshpande S, Cheng JF, Tapia R, et al. Complete genome sequence of Calditerrivibrio nitroreducens type strain (Yu37-). Stand Genomic Sci 2011; 4:54-62.
17. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541-547.
18. Garrity G. NamesforLife. BrowserTool takes expertise out of the database and puts it right in the browser. Microbiol Today 2010; 37:9.
19. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576-4579.
20. Garrity GM, Holt JG. Phylum BIX. Deferribacteres phy. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey's Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 465.
21. Jumas-Bilak E, Roudière L, Marchandin H. Description of 'Synergistetes' phyl. nov. and emended description of the phylum 'Deferribacteres' and of the family Syntrophomonadaceae, phylum 'Firmicutes'. Int J Syst Evol Microbiol 2009; 59:1028-1035.
22. List Editor. Validation List no. 85. Validation of publication of new names and new combinations previously effectively published outside the IJSEM. Int J Syst Evol Microbiol 2002; 52:685-690.
23. Huber H, Stetter KO. Class I. Deferribacteres class. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey's Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 465.
24. Huber H, Stetter KO. Order I. Deferribacterales ord. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey's Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 465.
25. Huber H, Stetter KO. Family I. Deferribacteraceae fam. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey's Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 465-466.
26. BAuA. Classification of bacteria and archaea in risk groups. TRBA 2005; 466:190.
27. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene Ontology: tool for the unification of biology. Nat Genet 2000; 25:25-29.
28. Klenk HP, Göker M. En route to a genome-based classification of Archaea and Bacteria? Syst Appl Microbiol 2010; 33:175-182.
29. Wu D, Hugenholtz P, Mavromatis K, Pukall R, Dalin E, Ivanova NN, Kunin V, Goodwin L, Wu M, Tindall BJ, et al. A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea. Nature 2009; 462:1056-1060.
30. List of growth media used at DSMZ:
31. Gemeinholzer B, Dröge G, Zetzsche H, Haszprunar G, Klenk HP, Güntsch A, Berendsohn WG, Wägele JW. The DNA Bank Network: the start from a German initiative. Biopreservation and Biobanking 2011; 9:51-55.
32. The DOE Joint Genome Institute.
33. Phrap and Phred for Windows, MacOS, Linux, and Unix.
34. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008; 18:821-829.
35. Han C, Chain P. Finishing repeat regions automatically with Dupfinisher. In: Proceeding of the 2006 international conference on bioinformatics & computational biology. Arabnia HR, Valafar H (eds), CSREA Press. June 26-29, 2006: 141-146.
36. Lapidus A, LaButti K, Foster B, Lowry S, Trong S, Goltsman E. POLISHER: An effective tool for using ultra short reads in microbial genome assembly and finishing. AGBT, Marco Island, FL, 2008.
37. Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11:119.
38. Pati A, Ivanova NN, Mikhailova N, Ovchinnikova G, Hooper SD, Lykidis A, Kyrpides NC. GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 2010; doi:10.1038/nmeth.1457
39. Markowitz VM, Ivanova NN, Chen IMA, Chu K, Kyrpides NC. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 2009; 25:2271-2278.
40. Auch AF, von Jan M, Klenk HP, Göker M. Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison. Stand Genomic Sci 2010; doi:10.4056/sigs.531120
41. Auch AF, Klenk HP, Göker M. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs. Stand Genomic Sci 2010; 2:142-148.
42. Darzins A. Characterization of a Pseudomonas aeruginosa gene cluster involved in pilus biosynthesis and twitching motility: sequence similarity to the chemotaxis proteins of enterics and the gliding bacterium Myxococcus xanthus. Mol Microbiol 1994; 11:137-153.