Towards the human intestinal microbiota phylogenetic core
Julien Tap,1 Stanislas Mondot,1 Florence Levenez,1
Eric Pelletier,2,3 Christophe Caron,4
Jean-Pierre Furet,1 Edgardo Ugarte,2,3
Rafael Muñoz-Tamayo,1,5,6 Denis L. E. Paslier,2,3
Renaud Nalin,7 Joel Dore1 and Marion Leclerc1*
1INRA, UEPSD, UR910, 78350 Jouy en Josas, France.
2CEA, DSV, IG, Genoscope, 91057 Evry, France.
3CNRS UMR 8030, 91057 Evry, France.
4INRA, MIG, UR1077, 78350 Jouy en Josas, France.
5INRA, MIA, UR341, 78350 Jouy en Josas, France.
6L2S, UMR8506, Univ. Paris Sud-CNRS-SUPÉLEC,
91190 Gif sur Yvette, France.
7Libragen, 31400 Toulouse, France.
The paradox of a host specificity of the human faecal
microbiota otherwise acknowledged as characterized
by global functionalities conserved between humans
led us to explore the existence of a phylogenetic core.
We investigated the presence of a set of bacterial
molecular species that would be altogether dominant
and prevalent within the faecal microbiota of healthy
humans. A total of 10 456 non-chimeric bacterial 16S
rRNA sequences were obtained after cloning of PCR-
amplified rDNA from 17 human faecal DNA samples.
Using alignment or tetranucleotide frequency-based
methods, 3180 operational taxonomic units (OTUs)
were detected. The 16S rRNA sequences mainly
belonged to the phyla Firmicutes (79.4%), Bacter-
oidetes (16.9%), Actinobacteria (2.5%), Proteobacteria
(1%) and Verrucomicrobia (0.1%). Interestingly, while
most OTUs appeared individual-specific, 2.1%
were present in more than 50% of the samples and
accounted for 35.8% of the total sequences. These 66
dominant and prevalent OTUs included members of
the genera Faecalibacterium, Ruminococcus, Eubacterium,
Dorea, Bacteroides, Alistipes and Bifidobacterium.
Furthermore, 24 OTUs had cultured type strain
representatives, which should be subjected to genome
sequencing with a high degree of priority. Strikingly,
52 of these 66 OTUs were detected in at least
three out of four recently published human faecal
microbiota data sets, obtained with very different
experimental procedures. A statistical model confirmed
the prevalence of these OTUs. Despite the species
richness and a high individual specificity, a limited
number of OTUs is shared among individuals and
might represent the phylogenetic core of the human
intestinal microbiota. Its role in human health
deserves further study.
The human gut microbiota is a complex ecosystem, which
is now recognized as a key component in gastrointestinal
tract (GI tract) homeostasis. Its involvement in immune
diseases has recently been demonstrated and bacterial
imbalance or so-called ‘dysbiosis’ has been associated
with pathologies such as inflammatory bowel disease and
obesity (Marteau et al., 2004; Ley et al., 2005; 2006;
Swidsinski et al., 2005). These observations have stirred
a renewed interest in the mechanisms underlying such
imbalances and a search for biomarkers of healthy versus
diseased GI tract microbiota.
Culture-based methods initially provided a basic
knowledge on numbers and diversity of culturable micro-
organisms from human GI tract. Bacterial diversity was
estimated to exceed 400 culturable species and two
archaeal methanogenic species were isolated from
human faecal samples (Savage, 1977; Miller et al., 1982;
Finegold et al., 1983). Molecular analysis based on rDNA
gene structure (Woese et al., 1975; 1990), by targeting
both cultured and uncultured microorganisms, shed light
on microbial diversity (Amann et al., 1995). In the human GI
tract, depending on the method, 10–50% of the microbial
population was reported as uncultured (Amann et al., 1995;
Zoetendal et al., 2004; Ley et al., 2006).
The very first 16S rDNA molecular inventories of
healthy human faecal microbiota (Wilson et al., 1997;
Suau et al., 1999) had demonstrated the high diversity of
this ecosystem and pointed to the large number of
molecular species that did not correspond to any cultured
strains from available collections. Improved technical per-
formances have since led to higher numbers of clones
investigated in studied data sets (Eckburg et al., 2005).
Furthermore, within the last few years, metagenomics,
thanks to PCR-free identification, has been offering a new
insight into microbial diversity of the dominant microorganisms
(Gill et al., 2006; Manichanh et al., 2006). Hence
revisited, the human GI tract microbiota appeared dominated
by very few phyla when compared with other
complex ecosystems such as soils and oceans (Cole
et al., 2005), but nonetheless highly diverse and complex
at the level of ‘phylotypes’.
Received 5 November, 2008; accepted 28 May, 2009. *For
correspondence. E-mail; Tel. (+33) 1
34 65 23 06; Fax (+33) 1 34 65 24 92.
Environmental Microbiology (2009) 11(10), 2574–2584 doi:10.1111/j.1462-2920.2009.01982.x
© 2009 Society for Applied Microbiology and Blackwell Publishing Ltd
Profiling techniques targeting 16S rRNA genes indi-
cated that the human GI tract microbiota was stable over
time through adulthood (Zoetendal et al., 1998; Sutren
et al., 2000) and resilient to antibiotic treatment (De La
Cochetiere et al., 2005). Most importantly, it showed a
marked subject specificity in composition and species
diversity (Zoetendal et al., 1998).
At a macroscopic level, however, the microbiota sup-
ports a common set of metabolic pathways assembled in
a trophic chain common to all healthy individuals (Macfar-
lane and Gibson, 1994), with fermentation of dietary com-
pounds and endogenous substrates, followed by host
absorption and excretion of short-chain fatty acids (SCFA; acetate, propionate,
butyrate) and gas. Although the microbiota composition
seems to be host specific, the high degree of conservation
in its expressed functions and metabolites between
humans should translate into conserved features of the
environmental metabolome and proteome, derived from
redundancies in the GI tract microbiota transcriptome and
genome. We hypothesized that this should be supported
by the existence of a bacterial ‘phylogenetic core’ in
healthy adult faecal microbiota, consisting of a set of
dominant and prevalent microbial species. Extensive
molecular inventories of 16S rRNA genes were generated
for the faecal microbiota of 17 healthy individuals. Candi-
date core species present in more than 50% of individuals
in the studied cohort were identified and further validated
against recently published 16S rDNA sequence data sets
of human faecal microbiota from other countries. This
observation should have major implications in human GI
tract microbiomics.
Richness and diversity of human adult faecal microbiota
From the global analysis of the 10 456 sequences, 3180
operational taxonomic units (OTUs) were obtained for the
17 subjects (Table S1). The total number of OTUs differed
by less than 4% according to the analysis software, from
3180 to 3186 with CLUSTALW and MAFFT respectively.
Furthermore, when the tetranucleotide frequency method
was used instead of alignment, 3097 OTUs were obtained
(Table S2).
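The alignment-free comparison mentioned above can be illustrated with a minimal sketch. It assumes overlapping 4-mers over the alphabet A, C, G, T, relative frequencies, and a plain Euclidean distance between profiles; the function names are hypothetical, and the method actually used (after Teeling et al., 2004) may normalise the counts differently.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(seq):
    """Relative frequency of each overlapping 4-mer in seq.

    Windows containing anything other than A, C, G or T are skipped
    (an assumption of this sketch, not a documented rule of the study).
    """
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]


def profile_distance(seq_a, seq_b):
    """Euclidean distance between two tetranucleotide frequency profiles."""
    fa = tetra_frequencies(seq_a)
    fb = tetra_frequencies(seq_b)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

Sequences could then be grouped into OTUs by clustering on such distances instead of on alignment-based distances.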
The Chao1 estimation of total richness for the whole
sequence set led to very similar curves whatever the alignment
or clustering method (Fig. 1). The cumulative
number of OTUs increased linearly up to 8000 clones analysed.
For more than 8000 clones, a plateau seemed to be
reached, indicating that the sampling effort from this data
set allowed the estimation of dominant bacterial richness.
From this analysis, the faecal microbiota of 17 healthy
adults would at least reach 9940 OTUs.
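For readers unfamiliar with the estimator, Chao1 extrapolates total richness from the numbers of OTUs seen exactly once (F1) and exactly twice (F2). The sketch below uses the classic form S_obs + F1²/(2·F2) and falls back to the bias-corrected form when no doubletons are present; this is a generic illustration with a hypothetical function name, not the DOTUR implementation used in the study.

```python
def chao1(otu_counts):
    """Chao1 richness estimate from a list of per-OTU sequence counts."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)                        # observed number of OTUs
    f1 = sum(1 for c in counts if c == 1)      # singletons
    f2 = sum(1 for c in counts if c == 2)      # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    # Bias-corrected form, F1(F1 - 1) / (2(F2 + 1)), evaluated at F2 = 0.
    return s_obs + f1 * (f1 - 1) / 2.0
```

A library dominated by singletons therefore yields an estimate far above the observed OTU number, which is the pattern reported here (3180 observed OTUs against an estimate close to 10 000).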
When each subject data set was considered separately,
the average number of OTUs per subject was 259, ranging
from 159 to 383 (Table 1). There was no correlation
between the number of OTUs and the number of sequences
obtained per individual (r² = 0.00056, P = 0.7754, Spearman
method). Unambiguous sequences per individual
ranged from 426 to 899 (Table S1). Rarefaction curves did
not show any plateau except for samples AT and AV
(Fig. S1). In addition, diet did not have a statistically
significant impact on diversity, since the diversity detected
within the microbiota associated with a vegetarian or
omnivorous diet did not statistically differ from the overall diversity
(AMOVA calculations, Table S3). The estimated richness
averaged 943 OTUs per subject, and drastically differed
between individuals, ranging from 288 to 1651. At the
subject level, the Chao1 estimated richness did not reach
saturation except for the two samples AT and AV, for which
both the Chao1 and Simpson indices indicated a lower
diversity (Table 1).
Fig. 1. Chao1 estimates of human gut bacterial richness as a function of sample size. Sequence analysis methods: blue, tetranucleotide
frequency; green, alignment with CLUSTALW; red, alignment with MAFFT. Ninety-five per cent confidence intervals were computed with DOTUR.
Given the OTU definition, the total bacterial richness estimated by Chao1 did not significantly differ according to the sequence analysis
methods, because the confidence intervals overlapped at the significance level of 0.05.
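Table 1 reports diversity as Simpson's 1-D. As a hedged illustration, the sketch below assumes the finite-sample form D = Σ n_i(n_i − 1) / (N(N − 1)), where n_i is the number of sequences in OTU i and N the total number of sequences; whether the study used exactly this variant is an assumption, and the function name is hypothetical.

```python
def simpson_diversity(otu_counts):
    """Simpson diversity expressed as 1 - D, from per-OTU sequence counts."""
    counts = [c for c in otu_counts if c > 0]
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    # Probability that two sequences drawn without replacement share an OTU.
    d = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1.0 - d
```

Values close to 1 indicate that two randomly drawn clones rarely belong to the same OTU, so the lower values for samples AT and AV are consistent with a few OTUs gathering many sequences.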
Taxonomic description of global and individual libraries
The taxonomic affiliation of the 10 456 16S
rRNA gene sequences confirmed that the dominant
human faecal microbiota belonged to five phyla, with
79.4% Firmicutes; 16.9% Bacteroidetes; 2.5% Actinobacteria;
1% Proteobacteria; 0.1% Verrucomicrobia; and
0.1% others (data not shown). Differences were observed
in the taxonomic make-up of the 17 individual libraries.
The proportions of the three major phyla varied, from one
sample with only few sequences related to the Clostridium
leptum cluster, to another sample with only one OTU
belonging to the Bacteroidetes phylum (assigned to the
genus Alistipes). It was noticeable that, for most
genera, OTUs were not evenly distributed: most OTUs
gathered only a few sequences and, conversely, a few OTUs
gathered most of the sequences found in the corresponding
genus.
Quantitative PCR (qPCR) results were consistent with
molecular inventories data and confirmed this taxonomic
composition of the libraries. The same average composition
of taxonomic groups was obtained when qPCR
data and cloning-based sequencing were compared.
Indeed, the Firmicutes members dominated, with
C. leptum cluster IV, Clostridium coccoides cluster XIV
and Bacteroides/Prevotella as the most prevalent groups
(Table S4). When few sequences were assigned to a
group, the qPCR results demonstrated the same trend. At
a subdominant species level, molecular inventories and
qPCR were also consistent for Escherichia coli determi-
nation. However, the qPCR results and the molecular
inventory taxonomic assignment of the sequences from
the genera Lactobacillus and Bifidobacterium were not in agreement.
A set of OTUs shared among individuals
Among the 3180 OTUs detected, 2500 OTUs were
present in only one sample, meaning that 78.6% of OTUs
were subject specific (Fig. 2). The 680 remaining OTUs
(21.4%) were common to at least two samples. However,
none of the OTUs could be detected in all samples. The
prevalence curve followed an increase towards a limited
number of OTUs detected in more than half of the
samples (Fig. 2). Interestingly, 66 OTUs, representing
2.1% of the total detected OTUs, were present in more
than 50% of the individuals of the study. In addition, they
represented 35.8% of the sequences (3740 sequences).
Table 1. Characteristics of human fecal samples, and sequence data. Fecal samples were from 17 healthy adult individuals, eight males and nine
females, between 28 and 54 years old, living in France or in the Netherlands. Eight individuals followed a vegetarian diet, with various daily intakes
regarding protein sources, dairy products, fibers, from vegetarian to vegan. The others were omnivorous, also with differences in diet. Diets,
country, DNA concentration, chimera checked sequences, sequence accession numbers are detailed in Table S1.
Sample  Sex  Age  Number of sequences  Number of OTUs  Estimated richness (Chao1)  Diversity (Simpson; 1-D)
AA      M    39   636                  256             886.4                       0.9773
AB      F    39   468                  236             819.4                       0.9695
AC      M    45   679                  276             948.5                       0.9876
AD      F    34   633                  235             580.4                       0.9795
AF      F    41   619                  245             1110.3                      0.9802
AG      M    33   500                  234             532.4                       0.9894
AH      M    36   426                  195             931.3                       0.9658
AI      F    28   625                  285             954.6                       0.9841
AL      M    54   603                  326             1651.1                      0.9864
AM      F    41   573                  254             901.5                       0.9881
AN      F    31   491                  278             1478.0                      0.9894
AP      F    49   653                  383             1294.0                      0.9942
AQ      M    33   655                  271             992.0                       0.9449
AR      F    31   607                  297             797.7                       0.9885
AS      F    32   550                  296             1008.5                      0.9908
AT      M    37   839                  175             343.1                       0.9257
AV      M    29   899                  159             288.0                       0.9136
These 66 OTUs were thus more frequently
shared among individuals and also accounted for
more sequences, indicating that they might represent a
phylogenetic core.
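The selection described in this section can be sketched as a simple filter on an OTU-by-sample count table. The data structure, the function name and the `min_samples` parameter are hypothetical; the threshold is left explicit because the text states "more than 50% of the individuals" while the legends of Figs 2 and 4 give 8 out of 17 individuals or more.

```python
def select_core(table, min_samples):
    """Pick OTUs detected in at least min_samples samples.

    table maps an OTU identifier to a dict {sample identifier: sequence count}.
    Returns (core prevalence by OTU, share of OTUs in the core,
             share of all sequences carried by the core).
    """
    total_seqs = sum(sum(per_sample.values()) for per_sample in table.values())
    core = {}
    for otu, per_sample in table.items():
        prevalence = sum(1 for n in per_sample.values() if n > 0)
        if prevalence >= min_samples:
            core[otu] = prevalence
    core_seqs = sum(sum(table[otu].values()) for otu in core)
    return core, len(core) / len(table), core_seqs / total_seqs


# The shares reported in the text follow directly from the raw counts.
share_of_otus = round(100 * 66 / 3180, 1)         # 2.1
share_of_sequences = round(100 * 3740 / 10456, 1) # 35.8
```

With the real table, `select_core(table, 8)` would be expected to return the 66 OTUs discussed here, provided the same OTU definition is used.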
Taxonomic distribution of phylogenetic core OTUs
The diversity originating from the 17 faecal microbiota
was mapped using principal coordinate analysis (PCoA)
(Fig. 3). The core OTUs were not restricted to a specific
genus or even phylum, but fell into distinct phyla and
families, including prevalent and dominant members of
Bacteroides vulgatus, Roseburia intestinalis, Ruminococcus
bromii, Eubacterium rectale, Coprobacillus sp. and
Bifidobacterium longum (Fig. 3). The OTU with the
highest prevalence, 16 out of 17 individuals, belonged to
Faecalibacterium prausnitzii. Conversely, some
OTUs from the core, represented by few sequences,
appeared less visible, such as an OTU classified as a
Lachnospiraceae, shared by eight subjects but only
represented by 11 sequences. At the same time, one
OTU specific to AT sample was represented by more
than 150 sequences. These observations suggest that
abundance was not invariably related to frequency of
The phylogenetic core of healthy humans’ faecal micro-
biota herein described exhibited representatives of the
main phyla, and the 66 OTUs belonged to 18 genera
(Fig. 4). However, compared with the whole data set, the
Firmicutes phylum was highly represented in the core
(57/66 OTUs), while the Bacteroidetes phylum only
accounted for seven OTUs.
Each individual microbiota contributed to the phyloge-
netic core and harboured an average of 40 OTUs from the
phylogenetic core, ranging from 20 to 49 OTUs (Fig. 4).
The AT sample, with a lower diversity [Chao1 = 343.115 and
Simpson (1-D) = 0.9257], also provided a smaller contribution
to the phylogenetic core. There was, however, no
correlation between the contribution to the core and
the total number of OTUs per sample (r² = 0.1196,
P = 0.1739). Each sample harboured core OTUs from the
two main phyla, Bacteroidetes and Firmicutes, and 14 out of 17
harboured core OTUs from the Actinobacteria. A similar trend was observed at
the genus level. For instance, except for two of them, all
samples exhibited at least four OTUs assigned to the
genus Faecalibacterium. Similarly, all samples harboured
at least one OTU assigned to the genus Roseburia and to
the Bacteroides (except subject AL).
Fig. 2. Distribution of OTUs as a function of their prevalence in the
17 individuals. Operational taxonomic units were ranked from the
most prevalent (present in 16/17 individuals) to the least prevalent
ones (individual specific). Most prevalent OTUs, present in 8 out of
17 individuals or more, corresponded to 2.1% of all OTUs (n=66)
but represented 35.8% of all sequences (n=3740).
Fig. 3. Principal coordinate analysis of OTUs
from the faecal microbiota of 17 healthy
human individuals. A principal coordinate
analysis was performed using the full distance
matrix. Each OTU was pictured as a disk
whose area was proportional to the number of
sequences and the heat colours accounted
for the prevalence among the 17 individuals.
Operational taxonomic units represented by a
unique sequence (singleton) were not plotted.
In addition, when compared with the cultivated type
strains from RDP II, 38 OTUs (58%) were similar to a
cultivated species, with a 2% sequence dissimilarity
threshold (Table S5). Among the Bacteroidetes, the
species were Bacteroides stercoris, B. vulgatus, B. massiliensis,
Parabacteroides distasonis, Alistipes putredinis and
Alistipes shahii, and among the Firmicutes, the species
were F. prausnitzii, Ruminococcus obeum, R. bromii,
E. rectale, E. hallii, E. eligens and Dorea longicatena. Only
two cultured strains from the Actinobacteria were represented,
Bifidobacterium longum biovar longum and Collinsella aerofaciens.
Conversely, among the 42% not assigned to a
species, 14 OTUs, from the Firmicutes and to a lesser
extent from the Bacteroidetes phylum, were distant by
more than 5% sequence divergence from the closest cultivated
type strains.
Fig. 4. Taxonomic and prevalence characterization of the phylogenetic core. Sixty-six OTUs present in at least 8 individuals out of 17 were
shown, the black dot representing their detection in a given individual. The taxonomic assignment of the 66 OTUs was obtained using
the classifier (RDP II release 9.61). The tree was built using the ade4 package in R. ‘Rumino’ and ‘Lachno’ indicated OTUs whose taxonomic
affiliation could only reach the family levels, Ruminococcaceae and Lachnospiraceae respectively.
Statistical characterization of the phylogenetic core
Based on the statistical model and the chosen criterion
(50% of individuals), a subset of 49 OTUs (out of a total of
3180 OTUs) was selected as the putative core. These 49
OTUs were the most prevalent among the 66 previously
selected. All core OTUs were described with their corresponding
probability estimates, with a 95% confidence
interval, and their normalized abundance p_j in the
core (Table S6). The calculation of confidence intervals
attached to the probability estimates made it possible to
evaluate the uncertainty of this assessment of the core. According
to the confidence intervals, the 10 most frequent
OTUs, very likely to be part of the core with respect to the
50% threshold, were related to the following species:
F. prausnitzii; Anaerostipes caccae; Clostridium spiroforme;
Bacteroides uniformis; D. longicatena; Bifidobacterium longum
biovar longum; Clostridium sp. BI-114; Clostridium
bolteae. Furthermore, in order to take into account the
number of sequences per OTU in the core set, the nor-
malized abundance of the OTUs was calculated and
varied from 0.5% to 9%. Ten OTUs with the highest nor-
malized abundance would have an important contribution
to the core, and were affiliated to their closest isolated
type strain from RDP II database (Fig. S2).
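The statistical model itself is not detailed in this section, so the following is only a sketch of the kind of reasoning involved. It assumes that the detection of an OTU in k of n individuals is binomial and uses the Wilson score interval (a substituted, generic technique) for the underlying prevalence; the normalized abundance is assumed to be the share of core sequences carried by each OTU. All names are hypothetical.

```python
from math import sqrt


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion k/n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def normalized_abundance(core_counts):
    """Share of core sequences carried by each core OTU (assumed meaning of p_j)."""
    total = sum(core_counts.values())
    return {otu: n / total for otu, n in core_counts.items()}
```

Under these assumptions, an OTU detected in 16 of 17 individuals has a lower bound of about 0.73, well above the 50% criterion, whereas for an OTU detected in 8 of 17 the interval (about 0.26 to 0.69) straddles 0.5. This illustrates why only the most prevalent OTUs can be assigned to the core with confidence when n = 17.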
Core OTUs presence in external data sets
A systematic comparison of the sequences originating
from this data set against the published libraries was
performed, in order to get a broader estimation of OTU
redundancy (i.e. recovery of the same OTUs in four librar-
ies from other international studies), while taking into
account biases associated with experimental procedures.
From the whole data set, 17% of OTUs were present in
other 16S rRNA libraries, and 83% (3780 sequences)
were specific to this study (Fig. S3).
Strikingly, the 66 OTUs demonstrated a higher preva-
lence in public data sets (Fig. 5). All of them were
detected at least once in the four external libraries, and
78.8% of them (52 OTUs) were detected in at least three
of these four libraries. When the core OTUs highlighted by
the statistical model were subjected to the same analysis,
this occurrence in at least three libraries reached 81.6%.
When the presence in all data sets was the criterion,
24 core OTUs were retrieved. They all belonged to the
Firmicutes, and, for example, the OTUs assigned to the
genus Faecalibacterium were all detected in the four
external libraries. Conversely, the representation of OTUs
from other phyla was different: one OTU was only found
in this study and in that of Manichanh and colleagues (2006), and
shared more than 99% similarity with the species
B. longum (NCC2705 strain). Seven OTUs assigned to
the phylum Bacteroidetes were not found in the library of Gill and
colleagues (2006) but were found at least twice in the other
libraries.
Overall, the criterion chosen for phylogenetic core
determination seemed robust. From the biological data
obtained in this study and in the data sets published so far,
which were confirmed by statistical models, a set of
approximately 50 bacterial species may represent part of
the healthy human phylogenetic core.
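The comparison with external libraries can be sketched as a filter on sequence-similarity hits followed by a count of libraries per OTU. The thresholds are those given in the legend of Fig. 5 (at least 98% pairwise identity over at least 900 base pairs); the tuple layout of a hit and the function names are assumptions of this illustration, not the format produced by any particular BLAST run.

```python
from collections import defaultdict

MIN_IDENTITY = 98.0  # per cent pairwise identity (Fig. 5 legend)
MIN_COVERAGE = 900   # aligned base pairs (Fig. 5 legend)


def libraries_per_otu(hits):
    """Map each OTU to the set of external libraries in which it was detected.

    hits is an iterable of (otu_id, library, percent_identity, alignment_length).
    """
    found = defaultdict(set)
    for otu, library, identity, length in hits:
        if identity >= MIN_IDENTITY and length >= MIN_COVERAGE:
            found[otu].add(library)
    return found


def share_detected(found, otu_ids, min_libraries):
    """Fraction of otu_ids detected in at least min_libraries libraries."""
    n = sum(1 for otu in otu_ids if len(found.get(otu, ())) >= min_libraries)
    return n / len(otu_ids)
```

Applied to the 66 candidate OTUs with `min_libraries=3`, the reported figure corresponds to 52/66, i.e. 78.8%.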
The goal of this study was to assess the existence of a
phylogenetic core, consisting of a set of dominant species
prevalent among healthy adults. Because of the recent
demonstrations of strong links between phylogenetic dys-
biosis and health impairment or diseases, such a group of
microorganisms is expected to play a preponderant role
in gut homeostasis and human health.
A precise quantification of the extent of human GI tract
diversity has indeed been a critical ecological question
for more than 30 years. The estimate of 400 cultivated
species (Savage, 1977; Finegold et al., 1983) was
eclipsed by 16S rRNA-targeted molecular studies, and
numbers from several thousand (Eckburg et al., 2005)
up to 40 000 species have been estimated (Frank et al.,
2007). It remains critical to circumscribe the GI tract
microbial diversity inherent to humans. From this data set,
Chao1 estimates indicated that the human gut microbiota
richness could reach a saturation corresponding to at
least 10 000 OTUs, which is much higher than previously
reported (Eckburg et al., 2005). The taxonomic make-up of
the libraries was consistent with that previous study, even
though in Eckburg and colleagues (2005), the estimated
richness per individual was lower than that of the least diverse
sample from this study.
Fig. 5. Venn diagram representation of 66 putative core OTUs hits
against external libraries. The occurrence of the 66 prevalent OTUs
was assessed in the publicly available 16S rRNA libraries.
Sequences originating from healthy individual faecal samples only
were downloaded from GenBank from four external libraries:
Eckburg and colleagues (2005) (2339 sequences); Gill and
colleagues (2006) (2062 sequences); Manichanh and colleagues
(2006) (539 sequences); Li and colleagues (2008) (5413
sequences). BLASTN algorithm was used to determine the OTU
occurrence in external libraries with a minimum coverage of 900
base pairs and a minimum pairwise identity of 98%. Four-way
Venn diagrams were plotted with VENNY (http://bioinfogp.
In this study, 3180 OTUs were observed and this
appeared to be the highest diversity ever obtained with
a PCR-based method, and for the first time 17 individuals
were investigated. Furthermore, the OTU sequences
were all more similar to human GI tract species than to
any other clone sequences from the databases. This suggests
a larger trend in microbial evolution, in which faecal
microbial communities of the same host species (conspecific)
are more similar to each other than to those of
different host species.
Core OTUs were first chosen as those present in at least 8
out of 17 individuals (Figs 2 and 4). The further comparison with publicly
available human data sets strongly confirmed the
prevalence of these core OTUs. Strikingly, these experi-
ments sampled the same core OTUs, even though they
were performed worldwide with very different protocols
(sample handling, DNA extraction, Eubacteria-Universal
PCR primers, chimera detection procedure) known to
lead to different pictures of microbial diversity (Suau et al.,
1999; Kurokawa et al., 2007; Li et al., 2008). Most of them
were present in three out of the four available sequences
data sets on healthy human faecal samples, obtained in
Japan or in the USA. The only differences were the under-
representation in other libraries of core sequences related
to Bacteroides and Bifidobacterium genera, whose occur-
rences have already been discussed by Kurokawa and
colleagues (2007) and Suau and colleagues (1999).
In addition to the biological investigations, the probability
estimates from the binomial distribution of OTUs
made it possible to model, as the core set, the 49 most prevalent
OTUs from the primary selection of 66. The calculation of
confidence intervals attached to the probability estimates
made it possible to evaluate the uncertainty of the assessment
of the core. In this way, according to the chosen
criterion (>50% of individuals), the first 10 OTUs with the
highest probabilities were statistically considered to be
part of the core. Additional data would improve the esti-
mation and the narrowing of the confidence intervals
because the uncertainty of the probability estimates is still
high, due to small sample size (n=17). In addition, in the
statistical analyses, no distinction was made between the
sample of OTUs experimentally detected and the real
microbiota. As a consequence, one may expect an under-
estimation of OTUs present at a low abundance level,
close to detection threshold.
The high prevalence of OTUs was also an indication of
the species persistence in the human GI tract, and several
ecological factors could account for it. In terms of condi-
tions linked to the ecosystem, attachment to food par-
ticles, resistance to stress such as pH or mechanical
forces of peristaltic movement, would protect the species
from wash-out. From a metabolic point of
view, an inference to the putative role of the core species
could be attempted from the close strains that are already
sequenced or characterized. Their known metabolic func-
tions in anaerobic degradation of food polymers or their
immunological properties in relation to the host epithelium
would add critical information on the pool of putative core
proteins and metabolites. Twenty-four OTUs from the core were
closely related to cultivated type strains from the species
E. rectale, R. bromii, F. prausnitzii, Clostridium sp.
BI-114, B. stercoris, B. vulgatus, P. distasonis, A. putredinis,
R. obeum, E. hallii and D. longicatena.
Interestingly, a large range of metabolic functions
regarding the carbohydrate catabolism trophic chain were
covered since hydrolytic, fermentative, hydrogenotrophic
properties, and butyrate, lactate or acetate production
could be inferred from OTUs phylogenetic position.
Whether the core OTUs represent a set of species suffi-
cient for anaerobic degradation of dietary fibres remains
to be determined. A large proportion cannot be cultured; it
has, however, recently been shown that several
metabolic signatures can be assigned to uncultured microbial
populations (Li et al., 2008). This robustness
has indeed been described to be related to the functional
redundancy of a microbial ecosystem.
From these data, however, the diversity structure
interestingly appeared to depend on the genus considered.
Furthermore, the diversity structure at different taxonomic
levels can indeed be seen as a way to investigate
the impact of the host on community composition. Even
though a 16S rRNA sequence dissimilarity of 3% had
been used for molecular species characterization (Stackebrandt
and Goebel, 1994), the dissimilarity cut-off varied in
recent reports on human GI tract microbiota (Suau et al.,
1999; Eckburg et al., 2005; Gill et al., 2006). Interestingly,
in this study, the same shape of rarefaction curves was
obtained when the dissimilarity cut-off ranged from 1%
to 5%. Furthermore, tetranucleotide frequency count
(Teeling et al., 2004) also showed the same trend and this
work confirmed that this non-alignment-based method
enabled a fast and accurate phylogenetic assignment. A
similar approach had been previously described, including
for the human GI tract (Rudi et al., 2007).
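The shape of a rarefaction curve can be examined without resampling, using the analytic expectation E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], where N_i is the number of sequences in OTU i and N the total. The sketch below is a generic illustration with hypothetical function names; the curves in this study were produced with DOTUR, which may proceed by random resampling instead.

```python
from math import comb


def rarefied_richness(otu_counts, n):
    """Expected number of OTUs in a random subsample of n sequences."""
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total number of sequences")
    denom = comb(total, n)
    # comb(total - c, n) is 0 whenever the subsample cannot avoid the OTU.
    return sum(1 - comb(total - c, n) / denom for c in counts)


def rarefaction_curve(otu_counts, step=1):
    """List of (subsample size, expected OTU number) pairs."""
    total = sum(c for c in otu_counts if c > 0)
    return [(n, rarefied_richness(otu_counts, n)) for n in range(0, total + 1, step)]
```

Recomputing the per-OTU counts at each dissimilarity cut-off (1% to 5%) and overlaying the resulting curves is one way to compare their shapes, as described above.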
One interesting outcome of the large number of
sequences per individual obtained in this study concerned
the diversity of the genus Faecalibacterium. Faecalibacterium
prausnitzii-related sequences have been repeatedly
recovered among the most prevalent species, and
described as dominant in healthy individuals and under-
represented in patients with inflammatory bowel disease
(Manichanh et al., 2006). The species was originally described for its
butyrate production (Duncan et al., 2002), and its anti-inflammatory
properties have very recently been
described (Sokol et al., 2008). Based on the seven distinct
OTUs identified in more than 50% of the individuals
of this study, we hypothesized a greater phylogenetic
and functional diversity in this genus, which would
be consistent with the connection of F. prausnitzii-related
sequences to different metabolites (Li et al., 2008).
When diversity was specifically observed at an indi-
vidual level, a strong host adaptation could be empha-
sized. For example, the low number of core OTUs from
the Bacteroidetes phylum may not only be linked to
technical differences between the studies or to lower
sequence number. Recently, the compositional complexity
of the genus Bacteroides was highlighted in human gut metagenomes
(Kurokawa et al., 2007) and similarly, among the
17 individuals of this study, the individual variability within
the Bacteroides genus was particularly high.
As further evidence supporting the core concept, a
very high individual variability was observed, consistent
with earlier work using ribotyping methods (Zoetendal
et al., 1998; Sutren et al., 2000). Sequence data demonstrated
that 78.6% of OTUs were specific to a given
individual. As a confirmation, when these OTUs were
compared with external databases, the prevalence was
not high. Quantitative PCR data revealed the same high
variability, particularly for the Actinobacteria quantity. Fur-
thermore, when the diversity according to age, country of
origin, diet was tested with AMOVA, the individual variabil-
ity, which could be partly random, explained most of the
This meant that dietary habits (vegetarian versus
omnivorous) did not explain much of the genetic diversity.
In addition, the distribution of clone frequencies between
vegetarians and omnivores, compared statistically using
discriminant analysis, explained only 5% of the variability. More
samples and time series, together with genomic charac-
terization, are required to assess how diet shapes the
human gut microbiota.
A number of core OTUs were present in all checked
databases; as an outcome of this work, sequencing of the
corresponding strains should be given high priority. Reference
genomes are required for the characterization of
the human gut microbiome, and cultured representatives
‘have to be selected based on comprehensive 16S rDNA
gene based survey’ (Turnbaugh et al., 2007). Twenty-four
OTUs from the core were close to cultivated type strains,
with some of them already being sequenced. However,
the numerous OTUs far from cultivated strains should
also be targeted using cell-sorting strategies and new
single-cell sequencing technologies.
Metagenomic data sets have already started to shed
light on the functional redundancy between healthy individuals
(Gill et al., 2006; Kurokawa et al., 2007). Future
studies on larger cohorts will make it possible to explore
the link between gene redundancy and the prevalence of
members of the putative phylogenetic core. Statistical
models, as developed in this study, are also required in a
broader perspective, to estimate sampling depth and
number of individuals needed to characterize the ‘full’
human microbiome.
It is now recognized that imbalances among microbial groups
can be linked to disease. This work, together with others,
leads towards a set of species important for human
health. If confirmed, the main outcome of this work will
be the design and application of a fast screen for the
phylogenetic core as a diagnostic tool. The next step towards a
better understanding will be to assess how the transformation
of human lifestyle influences the evolution of these
microorganisms and thereby health and predisposition to various
Experimental procedures
Subjects and sampling
The 17 study subjects were healthy adults between 29 and
54 years old, male and female, living in France or in the
Netherlands (Table 1). Eight subjects followed a vegetarian
diet, with varying daily intakes of protein sources,
dairy products and fibres, constituting a panel ranging from
vegetarian to vegan diets. The nine other subjects were omnivorous,
also with differences in diet. Faecal samples were stored in sterile
Sarstedt tubes at -80°C until further processing. None of the
volunteers had received antibiotic treatment 6 months prior to
Extraction of genomic DNA
Total DNA was extracted from 0.2 g of faecal samples, using
a bead-beating method as previously described (Godon
et al., 1997). The DNA preparation for the AV sample was
performed as previously described (Courtois et al., 2003). DNA
concentration and purity were estimated by gel electrophoresis
and spectrometry (NanoDrop).
Bacterial 16S rRNA amplification
The 16S rDNA genes were amplified from extracted DNA
using bacterial primers U-350f (5′-CTCCTACGGGAGGCAGCAGT-3′)
(Amann et al., 1990) and P-1392r (5′-GCGGTGTGTACAAGACCC-3′)
(Kane et al., 1993). PCR
reactions were run as previously described (Suau et al.,
1999), using AmpliTaq Gold DNA Polymerase (Applied Bio-
systems) and a PTC 100 Thermocycler (MJ Research).
© 2009 Society for Applied Microbiology and Blackwell Publishing Ltd, Environmental Microbiology,11, 2574–2584
Three PCR products from each extracted DNA sample were
pooled and purified using Qiaquick PCR purification kit
columns (Qiagen), checked and stored at -20°C.
Cloning and sequencing
Cloning and sequencing were performed at the national
sequencing centre CEA-Genoscope (Evry, France). Purified
PCR products were ligated into pCR-4TOPO TA vectors and
electroporated into E. coli DH10B-T1 cells, according to
the manufacturer’s recommendation (Invitrogen). A total of
1500 colonies from each transformation were randomly
picked. Bidirectional Sanger sequence reads were trimmed
and assembled by PHRED-PHRAP (http://www.phrap.org/phredphrapconsed.html).
Sequence orientation was
checked using BLASTN (Altschul et al., 1997) against the RDP
II database. Sequences were retained with a 900 bp length
cut-off and a tolerance of 1% ambiguous nucleotides.
Sequence analysis and detection of OTU representative sequences
Chimera check was performed using MALLARD software
(Ashelford et al., 2006). Strict elimination reduced the 15 532
sequences to 10 456 unambiguous sequences, which were then
analysed using RapidOTU (Legrand et al., 2008). RapidOTU,
freely available at, and
offering up to 64 processors upon request, is a pipeline
written in Perl that connects software for the automatic analysis of
16S rRNA gene libraries. Multiple alignment was obtained
with CLUSTALW (Thompson et al., 1994; Li, 2003) or MAFFT
algorithm (Katoh et al., 2005). Computing a precise
alignment of the 10 456 sequences over 1317 gapped base
pairs was made possible by a Perl script that parallelizes
CLUSTALW. The distance matrices (F84
model) were computed by fdnadist (PHYLIP package: http:// (Felsenstein,
1989). Tetranucleotide frequency count using OCOUNT
(Teeling et al., 2004), implemented within the RapidOTU
pipeline, was also used to cluster the sequences, and
Pearson matrices were built and converted into distance
matrices. Operational taxonomic units were detected using
DOTUR (Schloss and Handelsman, 2005) with a default
2% sequence dissimilarity cut-off. RepOTUfinder, a newly
designed tool implemented in RapidOTU, automatically
selected and extracted a representative sequence for each
OTU by calculating the central sequence, i.e. the one with
the lowest distance to all the other sequences of the OTU.
The 10 456 sequences have been submitted to DDBJ/
EMBL/GenBank databases under the accession numbers
(FP074904 to FP085359).
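The alignment-free clustering route and the choice of a central sequence described above can be illustrated with a minimal Python sketch. It is not the RapidOTU, OCOUNT or RepOTUfinder code: the function names are hypothetical, the conversion of a Pearson correlation r into a distance is assumed to be 1 - r (the text does not state which conversion was used), and the central sequence is taken as the member with the lowest summed distance to the other members of its OTU.

```python
from itertools import product
import math

# All 256 possible tetranucleotides over the unambiguous DNA alphabet.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freq(seq):
    """Relative frequency of each overlapping tetranucleotide in seq."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing ambiguous bases are skipped
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def tetra_distance_matrix(seqs):
    """Assumed conversion: distance = 1 - Pearson r between profiles."""
    profiles = [tetra_freq(s) for s in seqs]
    n = len(profiles)
    return [[0.0 if i == j else 1.0 - pearson(profiles[i], profiles[j])
             for j in range(n)] for i in range(n)]


def central_sequence(dist, members):
    """Index of the member with the lowest summed distance to the others."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))
```

Given a distance matrix for the whole data set and the list of sequence indices assigned to one OTU, `central_sequence(dist, members)` returns the index of the representative under these assumptions.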
Ecological analysis and phylogenetic core detection
Ecology analyses were performed on the individual and on
the complete 16S rDNA data set. DOTUR files were used to
map rarefaction curves and to compute Chao1 estimated
OTU richness profiles. Simpson indices (1-D) of variability
between samples were obtained from the phylotype abundances.
To assess diet impact on genetic diversity, AMOVA
was computed using the ade4 statistical package (Chessel et al., 2004).
Genetic diversity of the whole data set was represented by
a PCoA analysis, computed using R software (http://pbil.univ- The distance matrix of the 3180 OTU
representative sequences was computed using the SeqinR
package (Charif and Lobry, 2007) and transformed into a
Euclidean matrix before the PCoA analysis.
The prevalence of each operational taxonomic unit was determined
as the number of the 17 individual 16S rRNA gene libraries in
which it occurred. Taxonomic characterization of the OTUs was
performed using the RDP II Classifier program (RDP II
Release 9.58) and diagrams were computed with the ade4
statistical package (Chessel et al., 2004). The similarity between
core OTU sequences and isolated type strains was obtained
by BLASTN against the 5171 isolated type strains 16S rDNA
sequences from RDP II.
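The richness, diversity and prevalence measures named in this section can be sketched in a few lines of Python. This is an illustrative sketch, not the DOTUR or ade4 code: the function names are hypothetical, the Chao1 estimator is written in its commonly used bias-corrected form, and Simpson's D is taken as the sum of squared relative abundances; the exact variants used in the study are not stated in the text.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU sequence counts.

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the numbers
    of OTUs observed exactly once and exactly twice.
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def simpson_one_minus_d(counts):
    """Simpson index expressed as 1 - D, with D = sum of squared proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


def prevalence(libraries):
    """Number of individual libraries in which each OTU occurs.

    libraries maps an individual to a dict of {OTU id: sequence count}.
    """
    prev = Counter()
    for otu_counts in libraries.values():
        for otu, count in otu_counts.items():
            if count > 0:
                prev[otu] += 1
    return prev
```

With 17 libraries, an OTU whose prevalence exceeds 8 is present in more than 50% of the individuals.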
16S rRNA gene qPCR
Quantitative PCR was performed on 16 of the faecal DNAs
using probes and settings previously described (Furet et al.,
2009). Quantitative PCR systems targeted Eubacteria, and
within the Firmicutes C. leptum group (Clostridium cluster IV),
C. coccoides group (Clostridium cluster XIV), Bacteroides–
Prevotella, E. coli, F. prausnitzii (Sokol et al., 2008),
Lactobacillus–Leuconostoc and Bifidobacterium.
Statistical detection of a putative phylogenetic core
Assuming independence between individuals,
a statistical model was used to define a putative
phylogenetic core. The presence/absence of the OTUs was
represented as a binomial distribution based on the prevalence,
where g_j denoted the probability that OTU j is
detected in an individual (details in Appendix S1) (Wilson,
1927; Agresti and Coull, 1998). The parameter g_j did not
provide information about the abundance of the OTUs in the
global data set. In order to also represent
abundance, the numbers of sequences of each OTU were
averaged over the subset of individuals where the OTU was
detected. Afterwards, the average abundances were normalized
to give a unitary representation of the core.
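The model itself is detailed in Appendix S1, which is not reproduced here. As a hedged illustration only, the sketch below assumes that the interval for the detection probability of an OTU is a Wilson score interval, as the citations to Wilson (1927) and Agresti and Coull (1998) suggest, and that the abundance step is a plain mean over the individuals in which the OTU was detected followed by rescaling to sum to 1. Function names and the default z value are assumptions.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    k: number of individuals in which the OTU was detected
    n: number of individuals sampled
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def normalised_mean_abundance(table):
    """Mean count per OTU over individuals where it was detected, rescaled.

    table maps an OTU id to its list of per-individual sequence counts.
    The returned values sum to 1 across OTUs.
    """
    means = {}
    for otu, counts in table.items():
        present = [c for c in counts if c > 0]
        if present:
            means[otu] = sum(present) / len(present)
    total = sum(means.values())
    return {otu: m / total for otu, m in means.items()}
```

For an OTU detected in 9 of 17 individuals, `wilson_interval(9, 17)` gives an interval around the observed prevalence of 9/17 that stays within 0 and 1, which is the property that motivates this interval for small samples.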
Detection of core OTUs in external data sets
From the four published studies on human microbiota, the
16S rRNA gene sequences linked to healthy adult faecal
samples were selected and downloaded from GenBank.
Comparisons of the 3180 OTUs or the 66 core OTUs were
performed using BLASTN with a 98% identity threshold and a
minimum coverage of 900 bases for each pairwise
alignment. Results were shown in a four-way Venn diagram
plotted with VENNY (
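The identity and coverage thresholds described in this section can be applied to BLASTN results with a short filter. The sketch below is illustrative and rests on several assumptions: the hits are in the 12-column tab-separated layout produced by current BLAST+ (query id, subject id, percent identity, alignment length, ...), which may differ from the output format used in the study; alignment length is used as a stand-in for coverage; and the function names are hypothetical.

```python
def filter_hits(tabular_text, min_identity=98.0, min_length=900):
    """Keep hits meeting the identity and alignment-length thresholds.

    Returns a dict mapping each query id to the set of subject ids it matched.
    """
    hits = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            hits.setdefault(query, set()).add(subject)
    return hits


def shared_otus(otus_by_dataset, min_datasets):
    """OTUs detected in at least min_datasets of the external data sets.

    otus_by_dataset maps a data set name to the set of OTU ids with a hit.
    """
    all_otus = set().union(*otus_by_dataset.values())
    return {
        otu for otu in all_otus
        if sum(otu in found for found in otus_by_dataset.values()) >= min_datasets
    }
```

Running `filter_hits` once per external data set and passing the resulting query sets to `shared_otus(..., 3)` would list the OTUs found in at least three of four data sets, which is the kind of count a four-way Venn diagram summarizes.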
Acknowledgements
We are very grateful to Dr E. Zoetendal (Laboratory of Micro-
biology, Wageningen University, the Netherlands) for provid-
ing us with samples and nutritional information; to Dr K. Kiêu
(MIA, INRA, France) for helpful discussions on the statistical
approach. J. Tap’s PhD and this project are supported by the
French National Agency for Research, ANR/DEDD/PNRA/
PROJ/200206-01-01, within the AlimIntest program.
References
Agresti, A., and Coull, B.A. (1998) Approximate is better than
exact for interval estimation of binomial proportions. Am
Statistician 52: 119–125.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J.,
Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped
BLAST and PSI-BLAST: a new generation of protein database
search programs. Nucleic Acids Res 25: 3389–3402.
Amann, R.I., Binder, B.J., Olson, R.J., Chisholm, S.W.,
Devereux, R., and Stahl, D.A. (1990) Combination of 16S
rRNA-targeted oligonucleotide probes with flow cytometry
for analyzing mixed microbial populations. Appl Environ
Microbiol 56: 1919–1925.
Amann, R.I., Ludwig, W., and Schleifer, K.H. (1995) Phyloge-
netic identification and in situ detection of individual micro-
bial cells without cultivation. Microbiol Rev 59: 143–169.
Ashelford, K.E., Chuzhanova, N.A., Fry, J.C., Jones, A.J.,
and Weightman, A.J. (2006) New screening software
shows that most recent large 16S rRNA gene clone librar-
ies contain chimeras. Appl Environ Microbiol 72: 5734–
Charif, D., and Lobry, J.R. (2007) SeqinR 1.0-2: A Contrib-
uted Package to the R Project for Statistical Computing
Devoted to Biological Sequences Retrieval and Analysis.
New York, USA: Springer Verlag.
Chessel, D., Dufour, A.-B., and Thioulouse, J. (2004) The
ade4 package-I – One-table methods. R News 4: 5–10.
Cole, J.R., Chai, B., Farris, R.J., Wang, Q., Kulam, S.A.,
McGarrell, D.M., et al. (2005) The Ribosomal Database
Project (RDP-II): sequences and tools for high-throughput
rRNA analysis. Nucleic Acids Res 33: D294–D296.
Courtois, S., Cappellano, C.M., Ball, M., Francou, F.X.,
Normand, P., Helynck, G., et al. (2003) Recombinant envi-
ronmental libraries provide access to microbial diversity for
drug discovery from natural products. Appl Environ Micro-
biol 69: 49–55.
De La Cochetiere, M.F., Durand, T., Lepage, P., Bourreille,
A., Galmiche, J.P., and Dore, J. (2005) Resilience of the
dominant human fecal microbiota upon short-course anti-
biotic challenge. J Clin Microbiol 43: 5588–5592.
Duncan, S.H., Hold, G.L., Harmsen, H., Stewart, C.S., and
Flint, H.J. (2002) Growth requirements and fermentation
products of Fusobacterium prausnitzii, and a proposal to
reclassify it as Faecalibacterium prausnitzii gen. nov.,
comb. nov. Int J Syst Evol Microbiol 52: 2141–2146.
Eckburg, P.B., Bik, E.M., Bernstein, C.N., Purdom, E., Deth-
lefsen, L., Sargent, M., et al. (2005) Diversity of the human
intestinal microbial flora. Science 308: 1635–1638.
Felsenstein, J. (1989) PHYLIP – Phylogeny Inference Package
(Version 3.2). Cladistics 5: 164–166.
Finegold, S.M., Sutter, V.L., and Mathisen, G.E. (1983)
Normal indigenous intestinal flora. In Human Intestinal
Microflora in Health and Disease. Hentges, D.J. (ed.). New
York, USA: Academic Press, pp. 3–31.
Frank, D.N., St. Amand, A.L., Feldman, R.A., Boedeker, E.C.,
Harpaz, N., and Pace, N.R. (2007) Molecular-phylogenetic
characterization of microbial community imbalances in
human inflammatory bowel diseases. Proc Natl Acad Sci
USA 104: 13780–13785.
Furet, J.P., Firmesse, O., Gourmelon, M., Bridonneau, C.,
Tap, J., Mondot, S., et al. (2009) Comparative assessment
of human and farm animal faecal microbiota using real-
time quantitative PCR. FEMS Microbiol Ecol 19: 19.
Gill, S.R., Pop, M., DeBoy, R.T., Eckburg, P.B., Turnbaugh,
P.J., Samuel, B.S., et al. (2006) Metagenomic analysis
of the human distal gut microbiome. Science 312: 1355–
Godon, J.J., Zumstein, E., Dabert, P., Habouzit, F., and
Moletta, R. (1997) Molecular microbial diversity of an
anaerobic digestor as determined by small-subunit rDNA
sequence analysis. Appl Environ Microbiol 63: 2802–2813.
Kane, M.D., Poulsen, L.K., and Stahl, D.A. (1993) Monitoring
the enrichment and isolation of sulfate-reducing bacteria
by using oligonucleotide hybridization probes designed
from environmentally derived 16S rRNA sequences. Appl
Environ Microbiol 59: 682–686.
Katoh, K., Kuma, K., Toh, H., and Miyata, T. (2005) MAFFT
version 5: improvement in accuracy of multiple sequence
alignment. Nucleic Acids Res 33: 511–518.
Kurokawa, K., Itoh, T., Kuwahara, T., Oshima, K., Toh, H.,
Toyoda, A., et al. (2007) Comparative metagenomics
revealed commonly enriched gene sets in human gut
microbiomes. DNA Res 14: 169–181.
Legrand, L., Tap, J., Gauthey, C., Doré, J., Caron, C., and
Leclerc, M. (2008) Rapid OTU: a fast pipeline to analyze
16S rDNA sequences by alignment or tetranucleotide frequency.
Proc. Gut Microbiome Symp. 2008 6th Congr.
INRA Rowett Res. Inst., poster 26, pp. 35.
Ley, R.E., Backhed, F., Turnbaugh, P., Lozupone, C.A.,
Knight, R.D., and Gordon, J.I. (2005) Obesity alters gut
microbial ecology. Proc Natl Acad Sci USA 102: 11070–
Ley, R.E., Turnbaugh, P.J., Klein, S., and Gordon, J.I. (2006)
Microbial ecology: Human gut microbes associated with
obesity. Nature 444: 1022.
Li, K.B. (2003) ClustalW-MPI: ClustalW analysis using dis-
tributed and parallel computing. Bioinformatics 19: 1585–
Li, M., Wang, B., Zhang, M., Rantalainen, M., Wang, S.,
Zhou, H., et al. (2008) Symbiotic gut microbes modulate
human metabolic phenotypes. Proc Natl Acad Sci USA
105: 2117–2122.
Macfarlane, G.T., and Gibson, G.R. (1994) Metabolic activi-
ties of normal colonic flora. In Human Health: The Contri-
bution of Microorganisms. Gibson, S.A.W. (ed.). London,
UK: Springer Verlag, pp. 17–52.
Manichanh, C., Rigottier-Gois, L., Bonnaud, E., Gloux, K.,
Pelletier, E., Frangeul, L., et al. (2006) Reduced diversity
of faecal microbiota in Crohn’s disease revealed by a
metagenomic approach. Gut 55: 205–211.
Marteau, P., Lepage, P., Mangin, I., Suau, A., Dore, J.,
Pochart, P., and Seksik, P. (2004) Gut flora and inflamma-
tory bowel disease. Aliment Pharmacol Ther 20 (Suppl. 4):
Miller, T.L., Wolin, M.J., de Macario, E.C., and Macario, A.J.
(1982) Isolation of Methanobrevibacter smithii from human
feces. Appl Environ Microbiol 43: 227–232.
Rudi, K., Zimonja, M., Kvenshagen, B., Rugtveit, J., Midtvedt,
T., and Eggesbo, M. (2007) Alignment-independent com-
parisons of human gastrointestinal tract microbial commu-
nities in a multidimensional 16S rRNA gene evolutionary
space. Appl Environ Microbiol 73: 2727–2734.
Savage, D.C. (1977) Microbial ecology of the gastrointestinal
tract. Annu Rev Microbiol 31: 107–133.
Schloss, P.D., and Handelsman, J. (2005) Introducing
DOTUR, a computer program for defining operational taxo-
nomic units and estimating species richness. Appl Environ
Microbiol 71: 1501–1506.
Sokol, H., Pigneur, B., Watterlot, L., Lakhdari, O., Bermudez-
Humaran, L.G., Gratadoux, J.J., et al. (2008) Faecalibac-
terium prausnitzii is an anti-inflammatory commensal
bacterium identified by gut microbiota analysis of Crohn
disease patients. Proc Natl Acad Sci USA 20: 20.
Stackebrandt, E., and Goebel, B.M. (1994) Taxonomic note:
a place for DNA–DNA reassociation and 16S rRNA
sequence analysis in the present species definition in bac-
teriology. Int J Syst Bacteriol 44: 846–849.
Suau, A., Bonnet, R., Sutren, M., Godon, J.J., Gibson, G.R.,
Collins, M.D., and Dore, J. (1999) Direct analysis of genes
encoding 16S rRNA from complex communities reveals
many novel molecular species within the human gut. Appl
Environ Microbiol 65: 4799–4807.
Sutren, M., Michel, C., de la Cochetière, M.F., Bernalier, A.,
Wils, D., Saniez, M.H., and Doré, J. (2000) Temporal tem-
perature gradient gel electrophoresis (TTGE) is an appro-
priate tool to assess dynamics of species diversity of the
human fecal flora. Reprod Nutr Dev 40: 176.
Swidsinski, A., Weber, J., Loening-Baucke, V., Hale, L.P.,
and Lochs, H. (2005) Spatial organization and composition
of the mucosal flora in patients with inflammatory bowel
disease. J Clin Microbiol 43: 3380–3389.
Teeling, H., Waldmann, J., Lombardot, T., Bauer, M., and
Glockner, F.O. (2004) TETRA: a web-service and a
stand-alone program for the analysis and comparison of
tetranucleotide usage patterns in DNA sequences. BMC
Bioinformatics 5: 163.
Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994)
CLUSTAL W: improving the sensitivity of progressive multiple
sequence alignment through sequence weighting, position-
specific gap penalties and weight matrix choice. Nucleic
Acids Res 22: 4673–4680.
Turnbaugh, P.J., Ley, R.E., Hamady, M., Fraser-Liggett,
C.M., Knight, R., and Gordon, J.I. (2007) The human
microbiome project. Nature 449: 804–810.
Wilson, E.B. (1927) Probable inference, the law of succes-
sion, and statistical inference. J Am Stat Assoc 22: 209–
Wilson, K.H., Ikeda, J.S., and Blitchington, R.B. (1997) Phy-
logenetic placement of community members of human
colonic biota. Clin Infect Dis 25: S114–S116.
Woese, C.R., Fox, G.E., Zablen, L., Uchida, T., Bonen, L.,
Pechman, K., et al. (1975) Conservation of primary struc-
ture in 16S ribosomal RNA. Nature 254: 83–86.
Woese, C.R., Kandler, O., and Wheelis, M.L. (1990) Towards
a natural system of organisms: proposal for the domains
Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA
87: 4576–4579.
Zoetendal, E.G., Akkermans, A.D., and De Vos, W.M. (1998)
Temperature gradient gel electrophoresis analysis of 16S
rRNA from human fecal samples reveals stable and host-
specific communities of active bacteria. Appl Environ
Microbiol 64: 3854–3859.
Zoetendal, E.G., Collier, C.T., Koike, S., Mackie, R.I., and
Gaskins, H.R. (2004) Molecular ecological analysis of
the gastrointestinal microbiota: a review. J Nutr 134: 465–
Supporting information
Additional Supporting Information may be found in the online
version of this article:
Fig. S1. Rarefaction curves of operational taxonomic unit
(OTU) detection per sample. Operational taxonomic units
were defined with 2% dissimilarity cut-off, for homogeneous
sequences of 1042 bases from nucleotides 350–1392 (E. coli
16S rRNA gene numbering) and fully aligned on 1317 bases
including gaps.
Fig. S2. Phylogenetic core based on statistical model. Each
fraction corresponded to an OTU that is part (%) of the
phylogenetic core. Ten OTUs were highlighted because of
their occurrence in the phylogenetic core.
Fig. S3. Venn diagram representation of 10 456 sequences
set (A) and the 3180 OTUs (B) hits against external libraries.
Four-way Venn diagrams were plotted with VENNY (http:// BLASTN algorithm
was used to determine the OTU occurrence in external
libraries with a minimum coverage of 900 base pairs and a
minimum pairwise identity of 98%. A total of 550 OTUs (6676
sequences) were found in other 16S rRNA libraries; 2630
OTUs (3780 sequences) were specific to this study.
Table S1. Characteristics of human faecal samples studied,
DNA concentration, total sequences, unambiguous sequen-
ces and sequences accession number per individual.
Table S2. Number of OTUs and estimated richness
assessed on the complete sequences data set according to
the alignment or tetranucleotide frequency algorithms.
Table S3. Analysis of molecular variance (AMOVA) between
omnivorous and vegetarian diets.
Table S4. Quantitative PCR assays on 16 healthy human
faecal samples.
Table S5. 16S rDNA sequence similarity between core OTU
representative and sequences from isolated strains.
Table S6. Probability estimation and confidence interval for
each OTU in the core to be part of the microbiota.
Appendix S1. Statistical detection of a putative phyloge-
netic core.
Please note: Wiley-Blackwell are not responsible for the
content or functionality of any supporting materials supplied
by the authors. Any queries (other than missing material)
should be directed to the corresponding author for the
... In terms of bacterial composition, the healthy human fecal microbiome is composed of five dominant bacterial phyla. Strict anaerobic bacteria from the phyla Firmicutes and Bacteroidetes are the most relatively abundant while Actinobacteria, Proteobacteria and Verrumicrobia constitute a minority in healthy individuals (Tap et al. 2009). To note, the low abundance of the facultative anaerobic Proteobacteria is indicative of a healthy gut microbiota (Hollister, Gao, and Versalovic 2014). ...
... Over time, the decrease in the luminal oxygenation allows obligate anaerobic species to expand and outnumber facultative bacteria. Within 3 years after birth, infant fecal microbiota reaches a mature adult-like profile (Yatsunenko et al. 2012), characterized by preponderance of the strict anaerobes Lachnospiraceae and Ruminococcacea (from the phyla Firmicutes), and Bacteroidaceae, Prevotellaceae and Rikenellaceae (from the phyla Bacteroidetes), with low abundance of Actinobacteria, Proteobacteria and Verrumicrobia (Tap et al. 2009;Sanidad and Zeng 2020) (Fig. 11). ...
Mucosal associated invariant T (MAIT) cells are an evolutionary conserved T cell subset, which recognize riboflavin precursor derivatives presented by MR1. Intestinal bacteria from the Bacteroidetes and Proteobacteria phyla can produce MAIT antigens, suggesting a direct interplay between MAIT cells and the microbiota. In humans, MAIT cells have been implicated in pathologies associated with intestinal dysbiosis such as inflammatory bowel diseases, but the role of MAIT cells in these diseases remains unknown. Increased levels of oxygen and oxidative stress are hallmarks of intestinal inflammation, and riboflavin contributes to bacterial respiration and to oxidative stress resistance, suggesting increased needs for riboflavin during intestinal inflammation. Supporting this hypothesis, bacteria possessing the riboflavin biosynthesis pathway are enriched in Crohn’s disease patients.In the first part of this work, we explored the hypothesis that MAIT cells monitor a bacterial metabolic pathway associated with altered gut ecosystem. We showed that hypoxia disruption in the colon, upon antibiotic treatment or dextran sodium sulfate (DSS)-induced colitis, increased MAIT antigen production by the microbiota. We sequenced the 16S rRNA genes from the cecum of antibiotic- and DSS-treated mice, and analyzed expression of the riboflavin pathway genes in the Helicobacter hepaticus model of colitis. Dysanaerobiosis was associated with expansion of Enterobacteriaceae upon vancomycin treatment, and Bacteroidaceae and Enterobacteriaceae in colitic mice. Riboflavin production provided a fitness advantage to Escherichia coli in the inflamed intestine. Both Bacteroidaceae and Enterobacteriaceae can produce high amounts of MAIT antigens in vitro. In the H. hepaticus model of colitis, ribD, which controls MAIT antigen production, was over-expressed by Bacteroidaceae, Clostridiaceae and Enterobacteriaceae during colitis. 
MAIT antigens crossed the intestinal barrier and induced T cell receptor (TCR) signalling in MAIT cells, which produced the tissue-repair mediator amphiregulin and reduced colitis severity. In the second part of this work, we characterized MAIT cells in the ileal environment, wherein high levels of oxygen have been observed at steady state. A range of bacteria that encode the riboflavin pathway populated the ileum, however ileal MAIT cells were not activated through TCR engagement and expressed distinct phenotypic profiles. We discuss the future studies that will be needed to understand MAIT-microbiota interactions in the ileum.Collectively, in the colon, MAIT cells directly sense and react to changes in bacterial metabolism associated with intestinal inflammation and provide host protection in return. This new host-microbiota interaction may explain MAIT cell activation in other pathologies associated with dysbiosis such as colorectal cancer.
... Figure 3C shows the increased genus of bacteria in ISS and Figure 3D the decreased bacteria in ISS. Of these, The genus Faecalibacterium is one of the most abundant and important symbiotic bacteria in the human gut microbiota, accounting for approximately 5% of the total bacteria in feces and belongs to the family Ruminococcaceae (28,29). The genus Eubacterium is also one of the core genera of the human gut microbiota, which is widely found in the human intestinal tract and has high specificity and adaptability to the human intestinal tract (30,31). ...
... These bacteria are major components of the gut microbiome. As mentioned, the genus Faecalibacterium, a member of the Ruminococcaceae family, is one of the most abundant and important symbiotic bacteria in the human gut microbiota (28,29). It is an important indicator of intestinal health and maintenance of intestinal homeostasis and decreased Faecalibacterium have been associated with inflammatory bowel disease, and diabetes diseases (32). ...
Full-text available
Background The gut microbiome is important for host nutrition and metabolism. Whether the gut microbiome under normal diet regulate human height remains to be addressed. Our study explored the possible relationship between gut microbiota, its metabolic products and the pathogenesis of idiopathic short stature disease (ISS) by comparing the gut microbiota between children with ISS and of normal height, and also the short-chain fatty acids (SCFAs) produced by the gut microbiota. Methods The subjects of this study were 32 prepubescent children aged 4-8 years. The fecal microbial structure of the subjects was analyzed by 16S rRNA high-throughput sequencing technology. The concentrations of SCFAs in feces were determined by gas chromatography-mass spectrometry. Results The richness of gut microbiota in ISS group was decreased, and the composition of gut microbiota was significantly different between ISS group and control group. The relative abundance of nine species including family Ruminococcaceae and genera Faecalibacterium and Eubacterium , in ISS group was significantly lower than that in control group (P<0.05). The relative abundance of 10 species, such as those belonging to genus Parabacteroides and genus Clostridium , in ISS group was significantly higher than that in control group (P<0.05). The concentration of total SCFAs and butyrate in ISS group was significantly lower than that in control group. The correlation analysis among different species, clinical indicators, and SCFAs showed that the relative abundance of family Ruminococcaceae and genera Faecalibacterium and Eubacterium was positively correlated with the standard deviation score of height. Furthermore, the concentrations of total SCFAs and butyrate were positively correlated with serum insulin-like growth factor 1 (IGF-1)-SDS. Disease prediction model constructed based on the bacteria who abundance differed between healthy children and ISS children exhibited high diagnostic value (AUC: 0.88). 
Conclusions The composition of gut microbiota and the change in its metabolite levels may be related to ISS pathogenesis. Strains with increased or decreased specificity could be used as biomarkers to diagnose ISS.
... The human microbiota is composed of different microbes including archaea, viruses, protozoa, and mainly bacteria which inhabit different parts of the human body for example in GIT, reproductive organs, skin, oral cavity, and the respiratory tract [1]. Three major phyla including Actinobacteria, Firmicutes, and Bacteroidetes primarily colonized human GIT microbiota [7]. Different pathways including neuroendocrine, metabolic, and immune interactions are helpful in the regulation and stabilization of the relationship between host and intestinal microbiota. ...
Full-text available
A huge diversity of microbial species continues to live with the human beings that are collectively known as microbiota. Several environmental factors can impact the microbial imbalance in the intestine which can play a starring role in health and disease conditions in humans. In this review, we have described the role of human microbiota in the individual's susceptibility to infectious diseases such as gastrointestinal, respiratory, and female reproductive tract infections. Here, we have discussed how the indigenous microbiota interacts with the host and the invader microorganisms including bacteria, fungi, and viruses, and can modify the outcome of infections. The complex mechanisms of colonization resistance mediated by the microbiota as a direct and indirect way to fight against the infectious agents have been highlighted. Moreover, the approaches for the modulation of human microbiota for the prevention or therapeutic management of infectious diseases have been discussed especially the potential therapies directly targeting the microbiota such as pro-biotics, prebiotics, as well as fecal microbiota transplantation. Further studies need to focus on the complex interactions between the host and microbial species which could be helpful for a better understanding of the hidden potential of gut microbiota in the physiology of the host and could provide novel therapeutic targets and approaches.
... g mice, with a decrease in bacterial diversity, which is known to be associated with intestinal inflammation and hepatic diseases [35]. Bacteroidetes and Firmicutes are the dominant microbiota in the gut [36]. Here, the ratio of Bacteroidetes to Firmicutes was found to be increased in SC-P. ...
Full-text available
Background: Periodontitis is a chronic multifactorial inflammatory disease. Porphyromonas gingivalis is a primary periopathogen in the initiation and development of periodontal disease. Evidence has shown that P. gingivalis is associated with systemic diseases, including IBD and fatty liver disease. Inflammatory response is a key feature of diseases related to this species. Methods: C57BL/6 mice were administered either PBS, or P. gingivalis. After 9 weeks, the inflammatory response in gut, spleen, and liver was analyzed. Results: The findings revealed significant disturbance of the intestinal microbiota and increased inflammatory factors in the gut of P. gingivalis-administered mice. Administrated P. gingivalis remarkably promoted the secretion of IRF-1 and activated the inflammatory pathway IFN-γ/STAT1 in the spleen. Histologically, mice treated with P. gingivalis exhibited hepatocyte damage and lipid deposition. The inflammatory factors IL-17a, IL-6, and ROR-γt were also upregulated in the liver of mice fed with P. gingivalis. Lee's index, spleen index, and liver index were also increased. Conclusion: These results suggest that administrated P. gingivalis evokes inflammation in gut, spleen, and liver, which might promote the progression of various systemic diseases.
... The human gut microbiota was composed of nine kinds of bacterial phyla, and dominated by four: Bacteroidetes, Firmicutes, Actinobacteria, and Proteobacteria (Tap et al., 2009). Although there was no significant difference of the four bacterial phyla between the asymptomatic gallstone patients and controls in our study, we found that the abundance of Ruminococcaceae_UCG-008, Sutterella, GCA-900066755, Butyricicoccus, unclassified_o_Lactobacillales, and Lachnospiraceae_ND3007_group were significantly decreased in asymptomatic gallstone patients at the genus level. ...
There are few studies on changes in the gut microbiota of patients with gallstones, especially patients with asymptomatic gallstones, and the existing studies have some deficiencies; for instance, the effects of metabolic factors on the gut microbiota are not considered. Here, we selected 30 asymptomatic gallstone patients from the survey population, and 30 controls matched for age and BMI. 16S rDNA sequencing was used to detect and compare the structural differences in the gut microbiota between the two groups. Compared with healthy controls, the abundance of gut microbiota in patients with gallstones increased significantly, while the microbiota diversity decreased. At the level of phylum, both groups were dominated by Firmicutes, Bacteroides, Proteobacteria, and Actinobacteria. At the genus level, there were 15 species with significant differences in abundance between the two groups. Further subgroup analysis found that only unclassified Lactobacillales showed differences in the intestines of gallstone patients with hypertension, non-alcoholic fatty liver disease, or elevated BMI (≥24). The structure of the gut microbiota in patients with gallstones changed significantly, and this might be related to the occurrence of gallstones, rather than to metabolic factors such as hypertension, non-alcoholic fatty liver disease, and obesity.
... Ruminococcaceae and Lachnospiraceae are both common gut endosymbionts and important in anaerobic digestion of plant compounds including ruminant cellulolytic digestions [89]. These two families are some of the most abundant in gut environments and contain a diverse array of fibrolytic enzymes [90][91][92][93]. Taken together, our nenue microbiota reveal novel microbial diversity in the midgut and strongly support the claim that terrestrial and marine herbivores are united in their use of anaerobic, fermentative bacterial gut endosymbionts to digest plant matter [32]. ...
Background: Gut microorganisms aid in the digestion of food by providing exogenous metabolic pathways to break down organic compounds. An integration of longitudinal microbial and chemical data is necessary to illuminate how gut microorganisms supplement the energetic and nutritional requirements of animals. Although mammalian gut systems are well-studied in this capacity, the role of microbes in the breakdown and utilization of recalcitrant marine macroalgae in herbivorous fish is relatively understudied and an emerging priority for bioproduct extraction. Here we use a comprehensive survey of the marine herbivorous fish gut microbial ecosystem via parallel 16S rRNA gene amplicon profiling (microbiota) and untargeted tandem mass spectrometry (metabolomes) to demonstrate consistent transitions among 8 gut subsections across five fish of the genus Kyphosus. Results: Integration of microbial phylogenetic and chemical diversity data reveals that microbial communities and metabolomes covaried and differentiated continuously from stomach to hindgut, with the midgut containing multiple distinct and previously uncharacterized microenvironments and a distinct hindgut community dominated by obligate anaerobes. This differentiation was driven primarily by anaerobic gut endosymbionts of the classes Bacteroidia and Clostridia changing in concert with bile acids, small peptides, and phospholipids: bile acid deconjugation associated with early midgut microbiota, small peptide production associated with midgut microbiota, and phospholipid production associated with hindgut microbiota. Conclusions: The combination of microbial and untargeted metabolomic data at high spatial resolution provides a new view of the diverse fish gut microenvironment and serves as a foundation to understand functional partitioning of microbial activities that contribute to the digestion of complex macroalgae in herbivorous marine fish.
Changes in the composition of the gut microbiota are associated with many human diseases. So far, however, we have failed to define homeostasis or dysbiosis by the presence or absence of specific microbial species. The composition and function of the adult gut microbiota is governed by diet and host factors that regulate and direct microbial growth. The host delivers oxygen and nitrate to the lumen of the small intestine, which selects for bacteria that use respiration for energy production. In the colon, by contrast, the host limits the availability of oxygen and nitrate, which results in a bacterial community that specializes in fermentation for growth. Although diet influences microbiota composition, a poor diet weakens host control mechanisms that regulate the microbiota. Hence, quantifying host parameters that control microbial growth could help define homeostasis or dysbiosis and could offer alternative strategies to remediate dysbiosis.
In this review, we aim to summarise key articles that explore relationships between the gut and ocular surface microbiomes (OSMs) and immune-mediated dry eye. The gut microbiome has been linked to the immune system by way of stimulating or mitigating a proinflammatory or anti-inflammatory lymphocyte response, which may play a role in the severity of autoimmune diseases. Although the ‘normal’ gut microbiome varies among individuals and demographics, certain autoimmune diseases have been associated with characteristic gut microbiome changes. Less information is available on relationships between the OSM and dry eye. However, microbiome manipulation in multiple compartments has emerged as a therapeutic strategy, via diet, prebiotics and probiotics and faecal microbial transplant, in individuals with various autoimmune diseases, including immune-mediated dry eye.
The seqinR package for the R environment is a library of utilities to retrieve and analyze biological sequences. It provides an interface between: (i) the R language and environment for statistical computing and graphics, and (ii) the ACNUC sequence retrieval system for nucleotide and protein sequence databases such as GenBank, EMBL, SWISS-PROT. ACNUC is very efficient in providing direct access to subsequences of biological interest (e.g., protein coding regions, tRNA, or rRNA coding regions) present in GenBank and in EMBL. Thanks to a simple query language, it is then easy under R to select sequences of interest and then use all the power of the R environment to analyze them. The ACNUC databases can be locally installed but they are more conveniently accessed through a web server to take advantage of centralized daily updates. The aim of this chapter is to provide a handout on basic sequence analyses under seqinR with a special focus on multivariate methods.
In the emerging field of environmental genomics, direct cloning and sequencing of genomic fragments from complex microbial communities has proven to be a valuable source of new enzymes, expanding the knowledge of basic biological processes. The central problem of this so-called metagenome approach is that the cloned fragments often lack suitable phylogenetic marker genes, rendering the identification of clones that are likely to originate from the same genome difficult or impossible. In such cases, the analysis of intrinsic DNA signatures like tetranucleotide frequencies can provide valuable hints on fragment affiliation. With this application in mind, the TETRA web-service and the TETRA stand-alone program have been developed, both of which automate the task of comparative tetranucleotide frequency analysis. Availability: TETRA provides a statistical analysis of tetranucleotide usage patterns in genomic fragments, either via a web-service or a stand-alone program. With respect to discriminatory power, such an analysis outperforms the assignment of genomic fragments based on the (G+C)-content, which is a widely-used sequence-based measure for assessing fragment relatedness. While the web-service is restricted to the calculation of correlation coefficients between tetranucleotide usage patterns of submitted DNA sequences, the stand-alone program generates a much more detailed output, comprising all raw data and graphical plots. The stand-alone program is controlled via a graphical user interface and can batch-process a multitude of sequences. Furthermore, it comes with pre-computed tetranucleotide usage patterns for 166 prokaryote chromosomes, providing a useful reference dataset and source for data-mining. Up to now, the analysis of skewed oligonucleotide distributions within DNA sequences is not a commonly used tool within metagenomics. With the TETRA web-service and stand-alone program, the method is now accessible in an easy-to-use manner for a broad audience. This will hopefully facilitate the interrelation of genomic fragments from metagenome libraries, ultimately leading to new insights into the genetic potentials of yet uncultured microorganisms.
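The idea of comparing tetranucleotide usage patterns by a correlation coefficient can be illustrated with a minimal Python sketch. It is not TETRA's own procedure: it is assumed here, for simplicity, that raw relative frequencies of the 256 tetranucleotides on a single strand are compared with a Pearson coefficient, and the function names `tetra_freqs`, `pearson` and `tetra_correlation` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 tetranucleotides over the unambiguous DNA alphabet, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide (overlapping windows, one strand).

    Windows containing anything other than A, C, G or T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[t] / total for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def tetra_correlation(seq_a, seq_b):
    """Correlation between the tetranucleotide usage patterns of two sequences."""
    return pearson(tetra_freqs(seq_a), tetra_freqs(seq_b))
```

Calling `tetra_correlation(fragment_a, fragment_b)` on two DNA fragments returns a value between -1 and 1; under the assumptions above, higher values suggest more similar tetranucleotide usage. Very short fragments give sparse frequency vectors, so the value is only meaningful for reasonably long sequences.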
The conventional view of the human large bowel as an appendage of the digestive tract, whose principal purpose was the conservation of salt and water and the disposal of waste materials, is increasingly being replaced with that of a highly specialised digestive organ, which through the activities of its constituent microbiota rivals the liver in its metabolic capacity and in the diversity of its biochemical transformations.
The frequent discrepancy between direct microscopic counts and numbers of culturable bacteria from environmental samples is just one of several indications that we currently know only a minor part of the diversity of microorganisms in nature. A combination of direct retrieval of rRNA sequences and whole-cell oligonucleotide probing can be used to detect specific rRNA sequences of uncultured bacteria in natural samples and to microscopically identify individual cells. Studies have been performed with microbial assemblages of various complexities ranging from simple two-component bacterial endosymbiotic associations to multispecies enrichments containing magnetotactic bacteria to highly complex marine and soil communities. Phylogenetic analysis of the retrieved rRNA sequence of an uncultured microorganism reveals its closest culturable relatives and may, together with information on the physicochemical conditions of its natural habitat, facilitate more directed cultivation attempts. For the analysis of complex communities such as multispecies biofilms and activated-sludge flocs, a different approach has proven advantageous. Sets of probes specific to different taxonomic levels are applied consecutively beginning with the more general and ending with the more specific (a hierarchical top-to-bottom approach), thereby generating increasingly precise information on the structure of the community. Not only do rRNA-targeted whole-cell hybridizations yield data on cell morphology, specific cell counts, and in situ distributions of defined phylogenetic groups, but also the strength of the hybridization signal reflects the cellular rRNA content of individual cells. From the signal strength conferred by a specific probe, in situ growth rates and activities of individual cells might be estimated for known species. In many ecosystems, low cellular rRNA content and/or limited cell permeability, combined with background fluorescence, hinders in situ identification of autochthonous populations. Approaches to circumvent these problems are discussed in detail.