
Understanding Bias in Microbial Community Analysis Techniques due to rrn Operon Copy Number Heterogeneity

Abstract

Molecular tools based on rRNA (rrn) genes are valuable techniques for the study of microbial communities. However, the presence of operon copy number heterogeneity represents a source of systematic error in community analysis. To understand the types and magnitude of such bias, four commonly used rrn-based techniques were used to perform an in silico analysis of a hypothetical community composed of organisms from the Comprehensive Microbial Resource database. Community profiles were generated, and diversity indices were calculated for length heterogeneity PCR, automated ribosomal intergenic spacer analysis, denaturing gradient gel electrophoresis, and terminal RFLP (using RsaI, MspI, and HhaI). The results demonstrate that all techniques present a quantitative bias toward organisms with higher copy numbers. In addition, techniques may underestimate diversity by grouping similar ribotypes or overestimate diversity by allowing multiple signals for one organism. The results of this study suggest that caution should be used when interpreting rrn-based community analysis techniques.
Laurel D. Crosby and Craig S. Criddle
Stanford University, Stanford, CA, USA
INTRODUCTION
Microbial ecology addresses a vari-
ety of issues that range from the
function of a single population to the
myriad interactions of complex com-
munities. Researchers in this field have
the added challenge of scope because
microbial diversity and community
dynamics must be inferred using
indirect methods. Unfortunately, the
tools for discriminating and measuring
microbial populations are far from
ideal. One of the challenges is that mi-
crobial communities are exceedingly
diverse, with estimates suggesting be-
tween 4000 and 10 000 different micro-
bial genomes per gram of soil or sedi-
ment (1,2). These estimates are based
on DNA-DNA re-annealing curves, yet
traditional isolation techniques have re-
covered only a small fraction of this es-
timated diversity (3–5). Limitations to
culturing include the inability to predict
the proper culture medium to select un-
known organisms, and the propensity
of fast-growing organisms to outgrow
and overshadow the more relevant or-
ganisms that grow more slowly. The
development of molecular approaches
to community analysis has circumvented
the need for cultivation because
phylogenetically informative DNA se-
quences can be directly screened from
the environment.
The most widely used techniques
for organism identification and com-
munity analysis include those based on
16S rRNA (rrn) genes because of the
quality of phylogenetic information,
rapid and straightforward procedures,
and large databases of sequence infor-
mation. Despite the advantages of ribo-
somal DNA sequence analysis for stud-
ies of bacterial isolates, limitations ex-
ist for using rRNA genes to analyze
mixed communities (6–9). Problems
arise as a result of organisms that have
variable numbers of copies of the rrn
operon (10,11) and sequence hetero-
geneity between operons (12). While
intracistronic heterogeneity has been
cited as a source of “noise” for deter-
mining the phylogenetic rank of an iso-
late (12), the influence of rrn operon
copy number and sequence heterogene-
ity on community analysis techniques
is much more serious. The problem is
that microbial communities, in almost
all instances, are mixtures of unknown
organisms with unknown numbers of
copies of the rrn operon.
Techniques for the rapid evaluation
of community diversity have several
features in common. First, total genom-
ic DNA is extracted from an environ-
mental sample, and sequences of 16S
genes or intergenic spacer regions are
copied and amplified above the back-
ground of the genome using PCR.
These copies of DNA are then subject-
ed to various discrimination methods,
including electrophoretic separation of
fragments based on length, melting
temperature, or restriction fragment
lengths. The number of different elec-
trophoretic bands or peaks in the analy-
sis serves as a proxy for diversity, as
the different ribosome types (ribotypes)
are considered unique to a group of or-
ganisms. [Note that the term “ribotype”
is used here in terms of a class of RNA,
as opposed to the ribotype theory of the
origin of life (13).] For all techniques,
Research Report
2 BioTechniques Vol. 34, No. 4 (2003)
BioTechniques 34:__-__ (April 2003)
the signal intensity of a particular peak
reflects the number of copies of the
DNA fragment that contribute to that
peak. Unfortunately, the presence of
variable numbers of operons for organ-
isms in diverse communities leads to a
mixture of overlapping signals, multi-
ple signals for single populations, and
distorted estimates of abundance be-
tween organisms. The result is a com-
plicated portrait of community diversi-
ty that is difficult to interpret. To
acknowledge the potential biases in
these techniques, authors prudently ad-
vise readers to “interpret these data
with caution.”
For researchers to develop sound
experimental designs, accurate hy-
potheses, and meaningful conclusions
regarding community structure and
function, sources of systematic error in
community analysis techniques must
be identified and quantified. Biases re-
lated to genomic DNA extraction and
PCR amplification are well document-
ed (14–21). The goal of this paper is to
illustrate how rrn operon copy number
heterogeneity influences the interpreta-
tion of four commonly used 16S
rDNA-based community analysis tech-
niques. A hypothetical community was
constructed using gene sequences retrieved
from the Comprehensive Microbial Resource
(CMR) database, which is a collection of
completely sequenced and annotated genomes
compiled by The Institute for Genomic
Research (TIGR) (Rockville, MD, USA).
DNA sequences encoding the 16S
rRNA gene and adjacent spacer regions
were used for an in silico comparison
of four major community analysis tech-
niques: length heterogeneity PCR (LH-
PCR) (22), automated ribosomal inter-
genic spacer analysis (ARISA) (8), de-
naturing gradient gel electrophoresis
(DGGE) (23), and terminal RFLP
analysis (T-RFLP; with restriction en-
zymes RsaI, MspI, and HhaI) (9). Se-
quences were analyzed according to the
priming sequences and discrimination
methods for each technique. Diversity
indices were calculated based on the
observed distribution of fragment sizes
(or melting temperatures for DGGE)
and then compared with the ideal or
“true” diversity indices for the hypo-
thetical community. The results demon-
strate that rrn operon copy number het-
erogeneity strongly influences the
interpretation of 16S rDNA-based
community analysis techniques.
MATERIALS AND METHODS
The CMR database, maintained by
TIGR (www.tigr.org), was the source of
the genome sequences analyzed in this
report. The CMR database was selected
over GenBank® because it contains all
of the rrn operons for each organism.
The organisms evaluated had rrn oper-
on copy numbers ranging from 1 to 10,
with 47% of organisms having either
one or two operons and 71% having up
to four (Figure 1). This distribution of
operon frequency in the CMR database
approximates the operon frequency for
organisms in the Ribosomal RNA
Operon Database (rrndb) Release 2.3
reported by Klappenbach (24). The
CMR database provided coordinates for
the sequence location of the rrn genes,
which allowed for intergenic spacer
regions to be retrieved intact with the
16S rRNA genes. Sequences were retrieved
using the segment retrieval function on
the CMR Web site, where coordinates for
the 5′ end of the 16S gene and 5000 bases
downstream were used to retrieve the
segment. For
operons with the typical configuration
of rRNA genes (16S, 23S, and 5S), this
reflected the entire 16S rRNA gene, the
intergenic spacer region between the
16S and 23S rRNA genes, and a portion
of the 23S rRNA gene. The analysis in-
cluded all microbial species for which
operons were reported but was limited
to those organisms in which PCR
primers matched potential targets by at
least 65% for at least two techniques.
With this constraint, members of the
domain Archaea were excluded from
the analysis. In addition, the hypotheti-
cal community was limited to a single
strain of a given species in the event
that multiple strains were reported. This
reduced the magnitude of the rrn oper-
on copy number bias that could be at-
tributed to the characteristics of any
particular species.
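The 65% primer-matching criterion can be illustrated with a minimal sketch. It assumes an ungapped, position-by-position comparison of the primer against every window of the target; the source does not state how similarity was scored, so the scoring rule, the function names, and the toy sequences are all hypothetical.

```python
def best_primer_match(primer: str, target: str) -> tuple[int, float]:
    """Return (position, identity) of the best ungapped match of primer in target.

    Identity is the fraction of primer positions that agree with the target
    window. This scoring rule is an assumption made for illustration only.
    """
    primer, target = primer.upper(), target.upper()
    best_pos, best_identity = -1, 0.0
    for start in range(len(target) - len(primer) + 1):
        window = target[start:start + len(primer)]
        matches = sum(a == b for a, b in zip(primer, window))
        identity = matches / len(primer)
        if identity > best_identity:
            best_pos, best_identity = start, identity
    return best_pos, best_identity


def primer_matches(primer: str, target: str, cutoff: float = 0.65) -> bool:
    """True when the best window reaches the similarity cutoff (65% by default)."""
    _, identity = best_primer_match(primer, target)
    return identity >= cutoff


# Toy sequences (hypothetical, not taken from the CMR database).
print(best_primer_match("ACGTACGTAC", "TTTTACGTACGTACTTTT"))
print(primer_matches("ACGTACGTAC", "GGGGGGGGGGGG"))
```

An organism would be retained in the hypothetical community only if a check of this kind succeeded for the primer sets of at least two techniques.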
The sequences for forward and re-
verse primers, as previously published
for each method, were used to search
for corresponding sites within the DNA
sequence. When these sites were found,
subsequences were extracted that represented
the fragment between the 5′ end of the
forward primer and the 3′ end of the
reverse primer. The lengths
of these fragments were recorded for
the LH-PCR and ARISA techniques,
while fragments for DGGE and T-
RFLP underwent further manipulation.
Sequence fragments for DGGE were
exported to the Winmelt software pack-
age (MedProbe AS, Oslo, Norway) to
estimate the Tm of the lowest melting
domain. For T-RFLP, sequences for a
given restriction enzyme were used to
identify the fragment length between
the 5′ end of the forward primer and the
first instance of the restriction enzyme
cutting site. For all techniques, the frag-
ment lengths (or melting temperatures,
as in the case of DGGE) were plotted as
histograms to simulate the electro-
pherogram type of output that is com-
mon to the automated techniques.
DGGE differs in this regard but was
similarly plotted to facilitate compar-
isons between the techniques.
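The fragment extraction described above can be sketched as follows. This is a simplified illustration, not the authors' procedure: it assumes exact primer matches (the analysis itself tolerated matches down to 65%), treats the amplicon as the span from the start of the forward primer site through the end of the reverse primer site, and uses hypothetical function names and a toy template. The recognition sequences are the standard ones for RsaI (GT^AC), MspI (C^CGG), and HhaI (GCG^C).

```python
# Recognition sequence and cut offset (bases of the site kept on the 5' fragment).
ENZYMES = {
    "RsaI": ("GTAC", 2),  # GT^AC
    "MspI": ("CCGG", 1),  # C^CGG
    "HhaI": ("GCGC", 3),  # GCG^C
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the in silico PCR product, or None if either primer site is missing.

    Assumes exact matches; the reverse primer is written 5'->3' and is located
    on the top strand through its reverse complement.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    reverse_start = template.find(reverse_site, start + len(forward))
    if reverse_start == -1:
        return None
    return template[start:reverse_start + len(reverse_site)]


def terminal_fragment_length(fragment, enzyme):
    """Length from the 5' end of the forward primer to the first cut, or None."""
    site, offset = ENZYMES[enzyme]
    position = fragment.upper().find(site)
    if position == -1:
        return None
    return position + offset


# Toy template and primers (hypothetical).
template = "GGGAAACCCTATTGTACTTCCGGTTGCGCTTGGCCCAAATT"
product = amplicon(template, forward="AAACCCTA", reverse="TTTGGGCC")
print(len(product))  # product length, as recorded for LH-PCR and ARISA
for name in ENZYMES:
    print(name, terminal_fragment_length(product, name))
```

Counting how many fragments share each length then gives the histogram-style profile used to simulate an electropherogram.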
Figure 1. Distribution of operon copy number frequency between the CMR database and rrndb (Release 2.3). At this time, the CMR comprises 45 different species entries, while the rrndb contains 287 entries.
To reduce bias unrelated to copy number, all members of the hypothetical community were assumed to be equally abundant, with an equal ratio of genomes. This eliminates the complication of variable cell densities or growth rates and allows for meaningful comparison of diversity indices. A final assumption was that there was no PCR amplification bias as a result of primer annealing efficiency. The 65% similarity cutoff between the primer and potential targets represents a PCR amplification reaction with low stringency. In reality, the type of systematic error attributed to primer bias is a serious complication of PCR-based community analysis techniques (17,21) and only exacerbates the errors contributed by rrn operon copy number heterogeneity.
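Under the equal-genome assumption, the share of signal contributed by an organism is simply its rrn copy number divided by the total number of copies in the community. The sketch below illustrates this for four organisms whose copy numbers are taken from the fragment counts in Table 1; the function name and the choice of subset are illustrative only.

```python
def signal_share(copy_numbers):
    """Fraction of the total rrn signal contributed by each organism when
    every organism is present as the same number of genomes."""
    total = sum(copy_numbers.values())
    return {name: copies / total for name, copies in copy_numbers.items()}


# rrn copy numbers as implied by the fragment counts in Table 1.
subset = {"B. subtilis": 10, "E. coli": 7, "C. crescentus": 2, "M. tuberculosis": 1}
shares = signal_share(subset)
genome_share = 1 / len(subset)  # each organism is one quarter of the genomes

for name, share in shares.items():
    print(f"{name}: {share:.2f} of signal vs {genome_share:.2f} of genomes")
```

In this four-member example, B. subtilis accounts for half of the signal although it represents one quarter of the genomes, while M. tuberculosis accounts for one twentieth.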
Diversity indices were calculated
for each technique, based on the ob-
served fragment distribution, where
each unique fragment type represented
a particular ribotype. The Shannon-
Weaver index was used as a diversity
index and was calculated as follows:
$$H = -\sum_{i} p_i \log p_i \qquad \text{[Eq. 1]}$$
where the summation is over all unique
fragments $i$, and $p_i$ is the proportion of
an individual “peak height” (i.e., num-
ber of same-sized fragments) relative to
the sum of all peak heights (i.e., total
number of fragments). Richness is the
number of unique fragment sizes (or
melting temperatures) identified by each
technique. No minimum signal intensity
threshold was used to determine peak
richness, although cutoffs are common-
ly applied to the interpretation of real
community analysis data. Evenness, or
the equitability of the observed ribo-
types, was calculated from the Shannon-
Weaver diversity function, where:
$$E = \frac{H}{\ln(\text{richness})} \qquad \text{[Eq. 2]}$$
At the same time, true values for
richness, evenness, and the Shannon-
Weaver index were calculated based on
the known species composition of the
hypothetical community, assuming that
each species would contribute only one
ribotype for each technique. For this
hypothetical community, 41 ribotypes
were present at equal abundance. The
observed values for the diversity in-
dices were then compared with the true
values to gain insight into the type and
magnitude of bias as a result of operon
copy number.
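The index calculations in Eq. 1 and Eq. 2 can be sketched in a few lines. The sketch assumes the natural logarithm in Eq. 1, which is consistent with the use of ln in Eq. 2 and reproduces the true value of 3.714 reported for 41 equally abundant ribotypes; the function name is hypothetical.

```python
import math


def diversity_indices(peak_heights):
    """Return (Shannon-Weaver index, richness, evenness) for a list of peak heights.

    Each peak height is the number of same-sized fragments in that peak.
    Natural logarithms are assumed throughout.
    """
    heights = [h for h in peak_heights if h > 0]
    total = sum(heights)
    proportions = [h / total for h in heights]
    shannon = -sum(p * math.log(p) for p in proportions)
    richness = len(heights)
    evenness = shannon / math.log(richness) if richness > 1 else float("nan")
    return shannon, richness, evenness


# Ideal community: 41 ribotypes, each contributing one equally sized peak.
ideal_h, ideal_s, ideal_e = diversity_indices([1] * 41)
print(round(ideal_h, 3), ideal_s, round(ideal_e, 3))
```

Passing the observed peak heights for a technique instead of the ideal list gives the observed indices to compare against these values.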
RESULTS
Table 1 presents the amplification
product lengths, melting temperatures,
and restriction fragment lengths for
each organism. The table is organized
such that the discriminating power of
the different techniques can be evaluat-
ed for any given organism, while gen-
eralizations about each technique can
be made by perusing the columns. En-
tries highlighted in bold represent frag-
ments that have two or more members
in common, while the number of frag-
ments of each length are presented in
parentheses. The histograms in Figure
2 represent the distribution of ribotypes
for each particular technique. The
scales for frequency distribution were
kept relevant to each technique, while
the gridlines for the vertical axis were
set to 10 units. This allows for a visual
comparison of the scales between the
different techniques.
LH-PCR gave hypothetical amplifi-
cation products with lengths ranging
314–371 bp, with an average product
length of 346 bp and standard deviation
of 13 bp. The technique produced a to-
tal of 26 unique product lengths for the
hypothetical community, where 14
(54%) represented unique peaks, and
12 (46%) peaks contained fragments
from two or more organisms. Of those
peaks with multiple contributors, the
number of organisms within the peak
ranged from two to eight. Fragments of
348 bp were highest in frequency, with
eight organisms contributing 40 copies
of the LH-PCR fragment to this peak.
In addition, there were 11 organisms
that contributed 52 copies of the frag-
ment within a size range of 5 bp
(352–356 bp). The incidence of in-
teroperon heterogeneity (heterogeneity
within the same organism) was relative-
ly low, with only five organisms with
more than one fragment length. Of the
organisms with heterogeneous copies
of length heterogeneity fragments, four
had fragments that differed by only one
base pair. Bacillus subtilis was the ex-
ception, with four unique fragment
lengths of 352, 353, 354, and 355 bp.
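The way fragments from different organisms collapse into shared peaks can be illustrated by pooling a subset of the LH-PCR entries from Table 1 by length. The data structure and function name below are illustrative; the values are the Table 1 entries for the organisms contributing to the 348-bp peak, plus one organism with a unique length.

```python
from collections import defaultdict

# (organism, fragment length in bp, copies): a subset of the Table 1 LH-PCR entries.
LH_PCR_SUBSET = [
    ("C. perfringens", 347, 9), ("C. perfringens", 348, 1),
    ("E. coli", 348, 7),
    ("H. influenzae", 348, 6),
    ("N. meningitidis", 348, 4),
    ("S. enterica", 348, 6), ("S. enterica", 349, 1),
    ("V. cholerae", 348, 8),
    ("X. fastidiosa", 348, 2),
    ("Y. pestis", 348, 6),
    ("M. tuberculosis", 344, 1),
]


def pool_peaks(records):
    """Group fragments by length, summing copies and tracking contributors."""
    peaks = defaultdict(lambda: {"copies": 0, "organisms": set()})
    for organism, length, copies in records:
        peaks[length]["copies"] += copies
        peaks[length]["organisms"].add(organism)
    return dict(peaks)


peaks = pool_peaks(LH_PCR_SUBSET)
for length in sorted(peaks):
    peak = peaks[length]
    print(length, peak["copies"], len(peak["organisms"]))
```

For the 348-bp peak this reproduces the eight contributing organisms and 40 copies noted above, while the nine organisms in the subset yield only four distinct peaks.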
For ARISA, the hypothetical ampli-
fication products ranged 308–1576 bp,
with an average of 751 bp (SD 239). The
product lengths were typically unique,
with six instances in which a maximum
of two to three organisms contributed to
a peak. Three organisms did not con-
tribute an amplification product because
of the orientation of the 16S and 23S
rRNA genes within the operon or be-
cause they lacked a sufficiently similar
priming site (less than the 65% se-
quence similarity criterion). Those or-
ganisms that did not contribute a prod-
uct tended to have a low operon copy
number, falling in the range from one to
two copies. Of the organisms with mul-
tiple operons, only Deinococcus radio-
durans gave an instance of a missed
product for one of its operons. Despite
the presence of organisms with no
hypothetical product, ARISA yielded
several peaks that exceeded the true
richness of the community. The hypo-
thetical community of 41 organisms
gave a total of 68 peaks. For organisms
with multiple rrn operons, the number
of unique amplification products was
almost equal to the number of operons
(Table 1). For example, Staphylococcus
aureus gave five unique product lengths
for six operons, and B. halodurans gave
six distinct product lengths, one for
each of its six operons. These length
differences were more than 1–2 bp, as
would be indicative of a minor insertion
or deletion event. In many instances, the
length differences were tens or hun-
dreds of bases apart, likely correspond-
ing to the presence or absence of vari-
ous tRNA sequences in the intergenic
spacer region (25). This result demon-
strates a combination of two systematic
errors for the ARISA technique: (i) the
underestimation of community diversi-
ty through missing or overlapping se-
quences and (ii) the overestimation of
diversity due to heterogeneous amplifi-
cation product lengths for a single or-
ganism.
The profile for DGGE showed a
range of melting temperatures from
70.4°C to 79.4°C, with an average tem-
perature of 74.0°C and standard devia-
tion of 1.7°C. The analysis gave a total
of 32 different melting temperatures,
with nine temperatures representing
amplification products from multiple
organisms. Of the peaks with multiple
contributors, seven contained only two
members and the other two contained
four members. The incidence of melt-
ing temperature heterogeneity for a sin-
gle organism was low, with only five
organisms that gave multiple signals.
For those with multiple temperatures,
the differences were frequently limited
to a tenth of a degree.
| Organism | LH-PCR | ARISA | DGGE | T-RFLP RsaI | T-RFLP HhaI |
|---|---|---|---|---|---|
| Agrobacterium tumefaciens | 314 (4) | 1494 (1), 1575 (1), 1576 (2) | 75.2 (4) | 824 (4) | 339 (4) |
| Aquifex aeolicus | 371 (2) | 607 (2) | 79.4 (2) | 503 (2) | 22 (2) |
| B. halodurans | 354 (6) | 965 (1), 1010 (1), 1091 (1), 1135 (1), 1254 (1), 1281 (1) | 75.5 (1), 75.6 (5) | 485 (5), 656 (1) | 240 (6) |
| B. subtilis | 352 (1), 353 (1), 354 (7), 355 (1) | 448 (1), 449 (4), 452 (3), 629 (2) | 74.8 (7), 75.2 (3) | 454 (1), 455 (1), 456 (5), 457 (1), 475 (2) | 238 (1), 240 (8), 241 (1) |
| Borrelia burgdorferi | N | N | 67.6 (1) | 28 (1) | 437 (1) |
| Brucella melitensis | 314 (3) | 1048 (3) | 75.8 (3) | 106 (3) | 61 (3) |
| Campylobacter jejuni | 346 (3) | 1074 (3) | 74.2 (3) | 453 (3) | 98 (3) |
| Caulobacter crescentus | 316 (2) | 969 (2) | 74.6 (2) | 422 (2) | 332 (2) |
| Chlamydia pneumoniae | 356 (1) | 513 (1) | 71.6 (1) | 106 (1) | 734 (1) |
| C. trachomatis | 357 (2) | 531 (2) | 74.5 (2) | 488 (2) | 735 (2) |
| Chlorobium tepidum | 342 (2) | 737 (2) | 76.2 (2) | 465 (2) | 91 (2) |
| C. perfringens | 347 (9), 348 (1) | 466 (1), 468 (4), 469 (1), 700 (2), 702 (2) | 75.2 (9), 75.3 (1) | 453 (9), 454 (1) | 233 (10) |
| D. radiodurans | 329 (3) | N (1), 308 (1), 1022 (1) | 75.7 (3) | 448 (3) | 82 (3) |
| Enterococcus faecalis | 366 (4) | 508 (2), 609 (1), 610 (1) | 73.8 (4) | 903 (4) | 218 (4) |
| E. coli | 348 (7) | 636 (1), 637 (1), 713 (1), 719 (2), 722 (1), 728 (1) | 74.6 (7) | 427 (7) | 373 (7) |
| Hemophilus influenzae | 348 (6) | 758 (3), 1003 (3) | 71.0 (6) | 463 (6) | 364 (6) |
| Helicobacter pylori | 333 (1), 334 (1) | N | 71.3 (2) | 846 (2) | 99 (2) |
| Listeria innocua | 356 (6) | 529 (4), 779 (2) | 72.6 (6) | 435 (6) | 186 (6) |
| Listeria monocytogenes | 356 (6) | 528 (4), 779 (2) | 72.6 (6) | 435 (6) | 186 (6) |
| Mesorhizobium loti | 314 (2) | 1197 (2) | 75.2 (2) | 682 (2) | 61 (2) |
| Mycobacterium leprae | 356 (1) | 569 (1) | 76.7 (1) | 307 (1) | 193 (1) |
| Mycobacterium tuberculosis | 344 (1) | 559 (1) | 77.6 (1) | 638 (1) | 201 (1) |
| Mycoplasma genitalium | 343 (1) | 482 (1) | 70.4 (1) | 475 (1) | 226 (1) |
| Mycoplasma pulmonis | 346 (1) | 569 (1) | 72.5 (1) | 477 (1) | 841 (1) |
| Neisseria meningitidis | 348 (4) | 946 (4) | 74.4 (4) | 126 (4) | 213 (4) |
| Nostoc sp. | 315 (4) | 569 (1), 796 (3) | 72.2 (4) | 424 (4) | 228 (4) |
| Porphyromonas gingivalis | 353 (4) | 1037 (4) | 72.0 (4) | 318 (4) | 102 (4) |
| Pseudomonas aeruginosa | 342 (4) | 753 (4) | 72.9 (4) | 644 (4) | 155 (4) |
| Ralstonia solanacearum | 346 (4) | 784 (4) | 74.7 (4) | 477 (4) | 572 (4) |
| Rickettsia conorii | 330 (1) | N | 73.0 (1) | 132 (1) | 1060 (1) |
| Salmonella enterica | 348 (6), 349 (1) | 636 (4), 797 (3) | 75.7 (7) | 427 (6), 428 (1) | 373 (6), 374 (1) |
| S. aureus | 355 (6) | 586 (1), 620 (1), 648 (1), 757 (1), 830 (2) | 72.2 (6) | 486 (6) | 238 (6) |

Table 1. Hypothetical Fragments Retrieved for LH-PCR, ARISA, DGGE, and T-RFLP
For the T-RFLP analysis, three enzymes were used to generate terminal
restriction fragments: RsaI, MspI, and
HhaI. Each restriction enzyme was
used to generate an independent T-
RFLP profile of the hypothetical com-
munity. The range of fragment lengths
for RsaI was 28–903 bp; for MspI, the
range was 24–566 bp; and HhaI frag-
ments ranged 22–1113 bp. The average
fragment lengths for RsaI, MspI, and
HhaI were 495 (SD 188), 300 (SD 184),
and 313 (SD 210) bp, respectively. For
brevity, only RsaI and HhaI fragments
were included in Table 1. T-RFLP
analysis with all three enzymes showed
examples of interoperon heterogeneity
within a single organism and overlap-
ping fragment lengths from multiple
organisms. Most examples of in-
teroperon heterogeneity occurred as a
result of fragment lengths that differed
by a single base pair. B. halodurans
was the only example of an organism
that possessed operons that had dis-
tinctly different restriction sites, which
resulted in two widely divergent frag-
ment lengths. For the RsaI enzyme, one
of the six operons of B. halodurans dif-
fered by 171 bases; whereas for MspI,
two operons out of six differed by 389
bases. This is an interesting result be-
cause this pattern emerged from two
different C→T transitions, disrupting
cut sites for two different restriction en-
zymes. Instances of overlapping signals
were also observed for the T-RFLP
analysis, with RsaI giving seven peaks
with multiple contributors, three peaks
for MspI, and five for HhaI.
Richness was estimated as the num-
ber of different ribotypes presented by
each technique. The true value for
species richness for the hypothetical
community was 41, while the observed
richness values for the four techniques
ranged from 26 (LH-PCR) to 68
(ARISA) (Table 2). DGGE gave a rich-
ness estimate of 32 members, while T-
RFLP gave values of 42, 40, and 38 for
RsaI, MspI, and HhaI, respectively. In
this hypothetical community, the true
measure for evenness was equal to 1.0,
as all species were equally abundant.
ARISA ranked highest with a value of
0.951, followed by T-RFLP with RsaI
(0.904), DGGE (0.900), T-RFLP with MspI
(0.885) and HhaI (0.881), and finally,
LH-PCR (0.814). The Shannon-Weaver
diversity index was used to account for
the abundance and evenness of ribo-
types generated by each technique. A
comparison of values for this index
showed that most techniques underesti-
mated diversity, with the exception of
ARISA (due to biases noted earlier).
All other values fell below the true val-
ue of 3.714. The diversity indices for
each of the techniques, in rank order,
are as follows: ARISA (4.012), T-RFLP
RsaI (3.378), T-RFLP MspI (3.263),
T-RFLP HhaI (3.206), DGGE (3.120),
and LH-PCR (2.653).
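As a consistency check, the evenness values can be recomputed from the reported Shannon-Weaver indices and richness values using Eq. 2. The sketch below uses the figures from Table 2; the dictionary layout, function names, and the tolerance of 0.001 (chosen to allow for rounding of the published values) are assumptions.

```python
import math

# Method: (Shannon-Weaver index, richness, reported evenness), from Table 2.
TABLE_2 = {
    "LH-PCR": (2.653, 26, 0.814),
    "ARISA": (4.012, 68, 0.951),
    "DGGE": (3.120, 32, 0.900),
    "T-RFLP (RsaI)": (3.378, 42, 0.904),
    "T-RFLP (MspI)": (3.263, 40, 0.885),
    "T-RFLP (HhaI)": (3.206, 38, 0.881),
    "Ideal (species level)": (3.714, 41, 1.000),
}


def evenness(shannon, richness):
    """Eq. 2: E = H / ln(richness)."""
    return shannon / math.log(richness)


def check_table(table, tolerance=0.001):
    """Map each method to True when the recomputed evenness matches the reported one."""
    return {
        method: abs(evenness(h, s) - reported) <= tolerance
        for method, (h, s, reported) in table.items()
    }


results = check_table(TABLE_2)
for method, ok in results.items():
    print(method, "consistent" if ok else "inconsistent")
```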
DISCUSSION
Clearly, the rrn operon copy number
has an effect on community analysis
techniques based on 16S rRNA genes.
Some of these techniques tend to com-
bine signals into a single peak, while
others tend to generate multiple signals
for a single organism. However, for all
of these techniques, the fact that an or-
ganism has multiple copies of an oper-
on leads to a quantitative bias for that
organism. The magnitude of this bias
depends on several factors, including
the range of fragment sizes generated
by the primer sets, the region of the rrn
operon amplified, and the discriminat-
ing power of capillary and gel elec-
trophoresis.
LH-PCR suffers most from overlap-
ping signals because it requires the dis-
crimination of small base pair differ-
ences. As an example, the original
reference for LH-PCR illustrates how
amplified products from soils form a
contiguous distribution (22). This tech-
nique has been previously cited as a
tool for quick assessments of changes,
but the meaning of any such change is
difficult to assess. For example, the loss
of a high copy number organism would
result in a more pronounced response
than the loss of a low copy number or-
ganism. Thus, more attention would be
drawn to drastic changes in large peaks,
rather than subtle changes that may be
equally or more relevant.

| Organism | LH-PCR | ARISA | DGGE | T-RFLP RsaI | T-RFLP HhaI |
|---|---|---|---|---|---|
| S. pneumoniae | 352 (4) | 529 (4) | 74.2 (4) | 889 (4) | 579 (4) |
| S. pyogenes | 354 (6) | 704 (6) | 73.6 (1), 73.7 (5) | 629 (5), 630 (1) | 581 (5), 582 (1) |
| Synechocystis sp. | 317 (2) | 746 (2) | 73.5 (2) | 425 (2) | 1048 (2) |
| Thermotoga maritima | 351 (1) | 525 (1) | 77.1 (1) | 86 (1) | 1113 (1) |
| Treponema pallidum | 352 (1), 353 (1) | 578 (1), 588 (1) | 75.6 (2) | 639 (1), 640 (1) | 850 (1), 851 (1) |
| Ureaplasma urealyticum | 345 (2) | 573 (2) | NA | 283 (2) | 370 (2) |
| Vibrio cholerae | 348 (8) | 707 (1), 713 (2), 792 (2), 793 (1), 968 (1), 994 (1) | 71.6 (5), 71.7 (1), 72.2 (2) | 427 (8) | 213 (8) |
| Xylella fastidiosa | 348 (2) | 746 (2) | 72.2 (2) | 479 (2) | 373 (2) |
| Yersinia pestis | 348 (6) | 746 (3), 806 (3) | 75.5 (6) | 884 (6) | 373 (6) |

Values represent amplified PCR fragment lengths for LH-PCR and ARISA, melting temperatures of the lowest melting domains for DGGE fragments, and restriction fragment lengths for T-RFLP (RsaI and HhaI only, MspI not included). The number of copies of each fragment are noted in parentheses, and the values in bold represent fragments that overlap in size with one or more organisms within the technique.
Table 1. Hypothetical Fragments Retrieved for LH-PCR, ARISA, DGGE, and T-RFLP continued
ARISA has also been cited as a
valuable tool due to its simplicity and
rapidity. Use of the intergenic spacer
region has the advantage that ISR frag-
ment banding patterns confer a finer
degree of phylogenetic resolution for
microbial isolates compared to frag-
ment analysis of 16S rRNA genes. Un-
fortunately, heterogeneity in the lengths
of intergenic spacer regions is a serious
complication for studies of mixed com-
munities. The magnitude of the prob-
lem is strongly influenced by the num-
ber of organisms with high copy
number, as opposed to the number of
organisms with fewer copies. Of the 22
organisms that gave a singular response,
the average copy number was 2.45 (SD 1.37),
compared to the copy number average of the
whole population, which was 3.67 (SD 2.49).
For comparison, the 16 organisms that gave
multiple signals had an average of 5.88
copies of the operon (SD 2.25). Another
observation is that the organisms
that did not give a signal (because
of sequence variation relative to the
“universal” PCR priming sites) also
tended to have a low copy number.
Again, this suggests that organisms of
potential relevance to the function of
the community may be overlooked.
Techniques such as DGGE and T-
RFLP also demonstrate copy number
bias, but to a lesser extent than LH-
PCR or ARISA. For DGGE, hetero-
geneity between operons was typically
limited to a single base change between
the DGGE fragments, which corre-
sponded to a temperature difference of
0.1°C. The technique also displayed
examples of overlapping sequences as
a result of two or more organisms shar-
ing the same ribotype. One point to
consider for the hypothetical DGGE
analysis is that melting temperatures
were estimates rather than discrete values.
Thus, in this hypothetical scenario,
the bias due to overlapping fragments
may be greater or less than the bias ob-
served experimentally. Another consid-
eration for the hypothetical analysis of
the DGGE technique is that DNA frag-
ments are normally separated and visu-
alized by gel electrophoresis, rather
than automated capillary electrophore-
sis. Thus, the degree of resolution of a
gel may differ from the electrophero-
gram type of output that is common to
the other techniques.
| Method | Shannon-Weaver Index | Richness | Evenness |
|---|---|---|---|
| LH-PCR | 2.653 | 26 | 0.814 |
| ARISA | 4.012 | 68 | 0.951 |
| DGGE | 3.120 | 32 | 0.900 |
| T-RFLP (RsaI) | 3.378 | 42 | 0.904 |
| T-RFLP (MspI) | 3.263 | 40 | 0.885 |
| T-RFLP (HhaI) | 3.206 | 38 | 0.881 |
| Ideal (species level) | 3.714 | 41 | 1.000 |

Calculations for the ideal values were based on a community with all populations at equal abundance.
Table 2. Diversity Indices for the Hypothetical Community
Figure 2. Frequency distribution of fragments generated by LH-PCR, ARISA, DGGE, and T-RFLP (RsaI). Scales for the x-axis are relative to each technique, while the gridlines for the y-axis were set to units of 10.
Figure 3. Plot of Shannon-Weaver index versus richness for communities of equally abundant populations. The relative positions of LH-PCR, ARISA, DGGE, and T-RFLP indicate how far the techniques deviate from the true value for these indices.
T-RFLP analysis with each of the
three restriction enzymes showed
heterogeneity between operons, but this was
typically limited to 1–2 bp. This corresponds
to minor insertion or deletion
events (indels) occurring between the
operons. The disruption of a restriction
enzyme cutting site was also observed
in the hypothetical community, al-
though the incidence of this type of
mutation was much less frequent. Re-
garding the discriminating power of T-
RFLP, the wider range of possible frag-
ment lengths led to a finer level of
separation than for LH-PCR or DGGE.
However, the T-RFLP profile also in-
cluded several small peaks that abutted
larger peaks, which, in practice, may
not be finely differentiated. Another
consideration for T-RFLP profiles and
the other techniques is that interpreta-
tion often involves a cutoff for peaks
that fall below a given intensity thresh-
old. For this hypothetical analysis, no
threshold was set, although several
peaks were present at very low fre-
quency relative to the larger peaks. In-
cluding such a cutoff would have had
the effect of omitting signals from
some of the heterogeneous operons of a
single organism and from low copy
number organisms. This would have
influenced the calculation of the diversity
indices and resulted in lower estimates
of diversity. According to the
Shannon-Weaver diversity index calcu-
lations, T-RFLP comes close to approx-
imating the actual diversity of the hy-
pothetical community. However, it
should be emphasized that, in this case
and for all techniques described here,
the bias of overlapping fragments di-
rectly offsets the bias of multiple sig-
nals by a single organism. Thus, two
compensating errors do not necessarily
yield a correct answer.
A potential limitation of this study is
the fact that the CMR database current-
ly emphasizes medically relevant or-
ganisms. These organisms may be sys-
tematically different in their copy
number compared to environmental
isolates, although comparison with the
less medically oriented rrndb suggests
that the copy number distributions are
quite similar. The current size of the
CMR database and the restrictions for
inclusion in the hypothetical communi-
ty also limit the scope of this analysis.
However, the behavior of the Shannon-
Weaver diversity index with respect to
richness (for communities of uniform
abundance and therefore an evenness
value of 1.0) lends some insight into
the value of this small subset of organ-
isms (Figure 3). At low diversity, the
addition of a single population has a
large impact on the diversity index val-
ue because the proportion of the new
population is relatively large compared
to the total number of populations. As
the community grows, the impact of
each additional population becomes
smaller and smaller. Given the observation
that the diversity index changes most
abruptly for communities of fewer than 20
organisms, it would appear that a hypothetical
community of 41 organisms is sufficient to
draw meaningful conclusions regarding the
analysis techniques described here. (Note that
the various diversity indices used in
this analysis may not be appropriate for
“real” 16S rDNA-based fragment
analyses due to biases inherent in PCR
amplification, fragment discrimination,
and operon copy number issues.) Addi-
tions to the database and updated hypo-
thetical analyses will determine
whether these trends remain consistent.
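The diminishing effect of each added population can be checked directly. For S equally abundant populations, every proportion is 1/S and the Shannon-Weaver index reduces to ln(S), so adding one population raises the index by ln((S+1)/S). The sketch below assumes the natural logarithm; the function names are illustrative only.

```python
import math

def shannon_uniform(richness):
    """H' for S equally abundant populations; equals ln(S)."""
    p = 1.0 / richness
    return -sum(p * math.log(p) for _ in range(richness))

def gain_from_one_more(richness):
    """Increase in H' when one equally abundant population is added."""
    return shannon_uniform(richness + 1) - shannon_uniform(richness)

# Compare the effect of one added population at several community sizes.
for s in (2, 5, 20, 41):
    print(s, round(shannon_uniform(s), 3), round(gain_from_one_more(s), 3))
```

At S = 41 the gain from one further population is ln(42/41), roughly 0.024, compared with ln(3/2), roughly 0.405, at S = 2.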
It is instructive to observe the his-
tograms generated by each technique
for the hypothetical community. All or-
ganisms were equally represented on
the basis of population densities, but
the histograms show a wide variation in
the ribotype abundance and diversity.
This illustrates how peak amplitude can
be deceptive in community analysis.
Changes in the height of a particular
peak can be caused by the growth or
loss of a single population, while subtle
changes in smaller peaks are over-
looked or discounted. Understanding
the rrn distribution for organisms in a
particular environment would improve
the application and interpretation of
molecular analyses. Recent studies
(11,26) suggest that variation in the
copy number of the rRNA genes is re-
lated to the ecological strategy of an or-
ganism. That is, organisms with multi-
ple copies of the rrn operon are able to
mobilize quickly in response to rich
growth conditions. Organisms with
fewer rrn operons are more limited in
their rate of ribosome synthesis and
respond less quickly to an influx of
nutrients into an environment. How does
the dynamic nutrient profile of an envi-
ronment shape the composition and, in
turn, function of the microbial commu-
nity? It may be that organisms of low
rrn operon copy number comprise a
significant portion of the microbial di-
versity, while high copy number organ-
isms flourish during nutrient perturba-
tions. These “fast responders” with
high copy number are the same kinds
of organisms that promptly appear on
culture plates under traditional culturing
methods, overshadowing the organisms that
grow more slowly. Molecular tools were
developed in part to avoid the bias of
culture-based techniques, yet the results of
this study suggest that 16S rDNA-based
molecular techniques may overemphasize the
same organisms.
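The effect of copy number on peak amplitude can be made concrete with a minimal sketch. It assumes that signal is simply proportional to cell count multiplied by rrn copy number, ignoring extraction and PCR biases; the organism names and copy numbers are hypothetical.

```python
def apparent_abundance(cells, copy_number):
    """Relative signal per organism when signal ~ cells * rrn copy number."""
    signal = {org: cells[org] * copy_number[org] for org in cells}
    total = sum(signal.values())
    return {org: s / total for org, s in signal.items()}

# Hypothetical community: equal population densities, unequal copy numbers.
cells = {"slow_1": 100, "slow_2": 100, "fast_1": 100, "fast_2": 100}
copies = {"slow_1": 1, "slow_2": 2, "fast_1": 7, "fast_2": 10}

profile = apparent_abundance(cells, copies)
for org, share in profile.items():
    print(org, "true share 0.25, apparent share", round(share, 2))
```

Under these assumptions the organism with 10 operons accounts for half of the total signal and the single-copy organism for one twentieth, although each makes up one quarter of the cells.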
Another point to consider is the term
ribotype, which is often meant to con-
vey the sequence similarity of the 16S
rRNA genes between two organisms.
Ribotyping is used to describe the
unique banding patterns of the rRNA
gene using various methods of discrim-
ination (restriction fragments, length
heterogeneity, etc.). Two organisms are
said to have common ribotypes when
they give identical signals for a given
technique. This hypothetical analysis
demonstrates that while restriction
fragment sites and gene fragment
lengths may be in common for one
technique, minute differences in the
DNA sequence may yield divergent re-
sponses for another technique. Thus, it
should be made clear that the concept
of ribotype as a measure for diversity is
entirely technique dependent.
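The technique dependence of a ribotype can be illustrated with two toy sequences. The sketch below assumes hypothetical 20-base amplicons that differ at a single position (real amplicons are far longer) and compares a length-based signal with a terminal restriction fragment signal for MspI, which recognizes CCGG and cuts after the first C.

```python
def lh_signal(amplicon):
    """Length heterogeneity signal: the amplicon length in bases."""
    return len(amplicon)

def trf_signal(amplicon, site="CCGG", cut_offset=1):
    """Terminal fragment length measured from the labeled 5' end.

    The default site and offset describe MspI (C^CGG). An amplicon
    without the site is reported at its full length.
    """
    pos = amplicon.find(site)
    return len(amplicon) if pos == -1 else pos + cut_offset

# Hypothetical amplicons of equal length, differing by one base (G -> A).
seq_a = "AAGTCCGGTTAGCATTCAGA"
seq_b = "AAGTCCGATTAGCATTCAGA"

print("LH-PCR :", lh_signal(seq_a), lh_signal(seq_b))
print("T-RFLP :", trf_signal(seq_a), trf_signal(seq_b))
```

The two sequences give the same length-based signal but different terminal fragments, so they would count as one ribotype by one technique and as two by the other.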
This study provides an initial explo-
ration of rrn operon copy number bias,
based on the content of the databases
available to date. Further investigation
may lead to the refinement of existing
methods and/or the development of
correction factors for improved esti-
mates of community diversity. In the
meantime, method development should
be directed toward technologies that
are based on single copy genes and/or
new discrimination methods. Until
these new techniques are readily avail-
able and broadly applicable, re-
searchers should continue to interpret
rrn-based techniques with caution.
ACKNOWLEDGMENTS
Support for this work was provided
by a National Institutes of Health
Training Grant in Biotechnology (no.
T32GM008412) and a NASA Graduate
Student Researchers Program Fellow-
ship (no. NGT-10-52619) to L.D.C. and
by project no. DE-FG03-00ER63046-
A001 from the U.S. Department of En-
ergy NABIR program.
REFERENCES
1. Torsvik, V., F.L. Daae, R.A. Sandaa, and L. Ovreas. 1998. Novel techniques for analyzing microbial diversity in natural and perturbed environments. J. Biotechnol. 64:53-62.
2. Torsvik, V., J. Goksoyr, and F.L. Daae. 1990. High diversity in DNA of soil bacteria. Appl. Environ. Microbiol. 56:782-787.
3. Amann, R.I., W. Ludwig, and K.H. Schleifer. 1995. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59:143-169.
4. Giovannoni, S.J., T.B. Britschgi, C.L. Moyer, and K.G. Field. 1990. Genetic diversity in Sargasso Sea bacterioplankton. Nature 345:60-63.
5. Ward, D.M., R. Weller, and M.M. Bateson. 1990. 16S rRNA sequences reveal numerous uncultured microorganisms in a natural community. Nature 345:63-65.
6. Dahllof, I., H. Baillie, and S. Kjelleberg. 2000. rpoB-based microbial community analysis avoids limitations inherent in 16S rRNA gene intraspecies heterogeneity. Appl. Environ. Microbiol. 66:3376-3380.
7. Dahllof, I. 2002. Molecular community analysis of microbial diversity. Curr. Opin. Biotechnol. 13:213-217.
8. Fisher, M.M. and E.W. Triplett. 1999. Automated approach for ribosomal intergenic spacer analysis of microbial diversity and its application to freshwater bacterial communities. Appl. Environ. Microbiol. 65:4630-4636.
9. Liu, W.T., T.L. Marsh, H. Cheng, and L.J. Forney. 1997. Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Appl. Environ. Microbiol. 63:4516-4522.
10. Farrelly, V., F.A. Rainey, and E. Stackebrandt. 1995. Effect of genome size and rrn gene copy number on PCR amplification of 16S rRNA genes from a mixture of bacterial species. Appl. Environ. Microbiol. 61:2798-2801.
11. Klappenbach, J.A., J.M. Dunbar, and T. Schmidt. 2000. rRNA operon copy number reflects ecological strategies of bacteria. Appl. Environ. Microbiol. 66:1328-1333.
12. Stackebrandt, E. 2002. Defining taxonomic ranks. In The Prokaryotes: an Evolving Electronic Resource for the Microbiological Community. (Online reference.) Title No. 10125.
13. Barbieri, M. 1981. The ribotype theory on the origin of life. J. Theor. Biol. 91:545-601.
14. Frostegard, A., S. Courtois, V. Ramisse, S. Clerc, D. Bernillon, F. Le Gall, P. Jeannin, X. Nesme, et al. 1999. Quantification of bias related to the extraction of DNA directly from soils. Appl. Environ. Microbiol. 65:5409-5420.
15. Martin-Laurent, F., L. Philippot, S. Hallet, R. Chaussod, J.C. Germon, G. Soulas, and G. Catroux. 2001. DNA extraction from soils: old bias for new microbial diversity analysis methods. Appl. Environ. Microbiol. 67:2354-2359.
16. Miller, D.N., J.E. Bryant, E.L. Madsen, and W.C. Ghiorse. 1999. Evaluation and optimization of DNA extraction and purification procedures for soil and sediment samples. Appl. Environ. Microbiol. 65:4715-4724.
17. Polz, M.F. and C.M. Cavanaugh. 1998. Bias in template-to-product ratios in multi-template PCR. Appl. Environ. Microbiol. 64:3724-3730.
18. Steffan, R.J., J. Goksoyr, A.K. Bej, and R.M. Atlas. 1988. Recovery of DNA from soils and sediments. Appl. Environ. Microbiol. 54:2908-2915.
19. Wilson, I.G. 1997. Inhibition and facilitation of nucleic acid amplification. Appl. Environ. Microbiol. 63:3741-3751.
20. Suzuki, M., M.S. Rappe, and S.J. Giovannoni. 1998. Kinetic bias in estimates of coastal picoplankton community structure obtained by measurements of small-subunit rRNA gene PCR amplicon length heterogeneity. Appl. Environ. Microbiol. 64:4522-4529.
21. Suzuki, M.T. and S.J. Giovannoni. 1996. Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl. Environ. Microbiol. 62:625-630.
22. Ritchie, N.J., M.E. Schutter, R.P. Dick, and D.D. Myrold. 2000. Use of length heterogeneity PCR and fatty acid methyl ester profiles to characterize microbial communities in soil. Appl. Environ. Microbiol. 66:1668-1675.
23. Muyzer, G., E.C. de Waal, and A.G. Uitterlinden. 1993. Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl. Environ. Microbiol. 59:695-700.
24. Klappenbach, J.A., P.R. Saxman, J.R. Cole, and T.M. Schmidt. 2001. rrndb: the ribosomal RNA operon copy number database. Nucleic Acids Res. 29:181-184.
25. Gurtler, V. and V.A. Stanisich. 1996. New approaches to typing and identification of bacteria using the 16S-23S rDNA spacer region. Microbiology 142:3-16.
26. Fogel, G.B., C.R. Collins, J. Li, and C.F. Brunk. 1999. Prokaryotic genome size and SSU rDNA copy number: estimation of microbial relative abundance from a mixed population. Microb. Ecol. 38:93-113.
Received 7 November 2002; accepted
14 January 2003.
Address correspondence to:
Dr. Craig S. Criddle
Department of Civil and Environmental
Engineering
Terman Engineering Center, Rm B-9
Stanford University
Stanford, CA 94305, USA
e-mail: criddle@stanford.edu
Vol. 34, No. 4 (2003) BioTechniques 9
... However, despite this significant advancement, the technique had a key limitation: it could not provide insights into the potential functional roles of microorganisms, as it relied solely on a single ribosomal gene sequence rather than a complete genome. In addition, the use of only the 16S rRNA sequence gave rise to different concerns such as the lack of phylogenetic resolution to resolve the deepest nodes [16,17], the presence of multiple heterogeneous copies of the gene within a given genome [18][19][20], and the formation of chimeric PCR amplification products from complex environmental samples [21]. Therefore, to overcome these issues, it is advisable to use additional gene markers [22]. ...
Article
Full-text available
Metagenome-assembled genomes (MAGs) have revolutionized microbial ecology by enabling the genome-resolved study of uncultured microorganisms directly from environmental samples. By leveraging high-throughput sequencing, advanced assembly algorithms, and genome binning techniques, researchers can reconstruct microbial genomes without the need for cultivation. These methodological advances have expanded the known microbial diversity, revealing novel taxa and metabolic pathways involved in key biogeochemical cycles, including carbon, nitrogen, and sulfur transformations. MAG-based studies have identified microbial lineages form Archaea and Bacteria responsible for methane oxidation, carbon sequestration in marine sediments, ammonia oxidation, and sulfur metabolism, highlighting their critical roles in ecosystem stability. From a sustainability perspective, MAGs provide essential insights for climate change mitigation, sustainable agriculture, and bioremediation. The ability to characterize microbial communities in diverse environments, including soil, aquatic ecosystems, and extreme habitats, enhances biodiversity conservation and supports the development of microbial-based environmental management strategies. Despite these advancements, challenges such as assembly biases, incomplete metabolic reconstructions, and taxonomic uncertainties persist. Continued improvements in sequencing technologies, hybrid assembly approaches, and multi-omics integration will further refine MAG-based analyses. As methodologies advance, MAGs will remain a cornerstone for understanding microbial contributions to global biogeochemical processes and developing sustainable interventions for environmental resilience.
... In our study, in all horizons, Acidobacteriota were significantly increased in microsites compared with standard mixed 250 mg samples (data not shown). The phylum Acidobacteriota is characterized by low 16S rRNA gene copy number (Stoddard et al., 2015;Větrovský & Baldrian, 2013) and higher GC content (Mann & Chen, 2010), which complies with the observation that proportions of organisms with low number of copies of the amplified gene (Crosby & Criddle, 2003;Farrelly et al., 1995) and high GC content (Benjamini & Speed, 2012;Dabney & Meyer, 2012;Laursen et al., 2017) are usually underestimated by PCR. Thus, it is suggested that Acidobacteriota might be systematically underestimated in standard soil samples. ...
Article
Full-text available
Microbial communities were studied in redoximorphic microsites of highly heterogeneous Gleysol at a mm scale using 16S and 18S amplicon sequencing to demonstrate if the composition of soil microbes reflects the differences in ferric and ferrous micro‐sites. In both explored gley horizons with redoximorphic features (Bg2 and Cg), ferric mottles were significantly enriched with total P and Fe and depleted of O, Si, Al, K and Ca compared with the adjacent ferrous groundmass (SEM–EDS). Ferric mottles were determined as Fe oxide coatings and hypocoatings. In Bg2, both prokaryotic and micro‐eukaryotic communities differed significantly between mottles and groundmass in composition of operational taxonomic units (OTUs) and in proportions of phyla, reflecting heterogeneities in the soil properties there. Mottles in Bg2 were characterized by increased proportion of Proteobacteria , decreased proportion of Acidobacteriota among prokaryotes and by dominance of a single proteobacterial OTU from Anaplasmataceae compared to all other samples. The composition of micro‐eukaryotes showed an opposite trend, as micro‐eukaryotes of Bg2 groundmass were unique among the other horizons, while micro‐eukaryotes of Bg2 mottles had similar composition to neighbouring horizons. Microbial communities of adjacent samples were not more similar to each other than communities of randomly selected ones in Bg2 horizon. That suggests that at mm scale, the sample distance does not represent the driving factor of microbial community composition and that the adjacent samples differ rather due to physicochemical factors. The spatial organization of microbial communities revealed in Bg2 has not reappeared in similarly organized Cg horizon, probably due to other overriding factors. 
The differences revealed between Bg2 and Cg horizons, including granulometric composition, content of crystalline Fe, exchangeable Al, and organic carbon, as well as exposition to groundwater, were discussed as possible reasons of the distinct organization in Cg. The similarity of pro−/eukaryotic communities of adjacent and non‐adjacent couples suggests no distance decay pattern at a mm scale. The agreement between patchiness in soil properties and microbial communities was revealed for the first time and confirms the importance of microscale patterns in soil.
... Universal primers, which target the broad range of conserved regions of 16S rDNA, do not have 100% bacterial coverage, and there is considerable variation in their coverage range [34]. At the species level among bacterial groups, the copy number of the 16S gene varies between 1 and 15, which is a potential major drawback [35][36][37]. Copy number variation is present even at the strain level of many species of bacteria [10]. This results in the potential over-or underestimation of the genetic content and provides a challenge when enumerating bacteria, specifically in mixed populations. ...
Article
Full-text available
The wide diversity of microbiota at the genera and species levels across sites and individuals is related to various causes and the observed differences between individuals. Efforts are underway to further understand and characterize the human-associated microbiota and its microbiome. Using 16S rDNA as a genetic marker for bacterial identification improved the detection and profiling of qualitative and quantitative changes within a bacterial population. In this light, this review provides a comprehensive overview of the basic concepts and clinical applications of the respiratory microbiome, alongside an in-depth explanation of the molecular targets and the potential relationship between the respiratory microbiome and respiratory disease pathogenesis. The paucity of robust evidence supporting the correlation between the respiratory microbiome and disease pathogenesis is currently the main challenge for not considering the microbiome as a novel druggable target for therapeutic intervention. Therefore, further studies are needed, especially prospective studies, to identify other drivers of microbiome diversity and to better understand the changes in the lung microbiome along with the potential association with disease and medications. Thus, finding a therapeutic target and unfolding its clinical significance would be crucial.
... In summary, based on the 16S rRNA gene diversity data, assuming all the limitations of this approach (Crosby and Criddle, 2003;Soergel et al., 2012;Ghyselinck et al., 2013;Rubio-Portillo et al., 2016) and the high proportion of OTUs related to uncultured microbes in the analyzed samples, the most frequently retrieved sequences corresponded to microorganisms related to S and C cycles, as expected for marine sediments (Plugge et al., 2011;Urich et al., 2014;Fike et al., 2015). ...
Article
Full-text available
Coastal marine lagoons are environments highly vulnerable to anthropogenic pressures such as agriculture nutrient loading or runoff from metalliferous mining. Sediment microorganisms, which are key components in the biogeochemical cycles, can help attenuate these impacts by accumulating nutrients and pollutants. The Mar Menor, located in the southeast of Spain, is an example of a coastal lagoon strongly altered by anthropic pressures, but the microbial community inhabiting its sediments remains unknown. Here, we describe the sediment prokaryotic communities along a wide range of environmental conditions in the lagoon, revealing that microbial communities were highly heterogeneous among stations, although a core microbiome was detected. The microbiota was dominated by Delta- and Gammaproteobacteria and members of the Bacteroidia class. Additionally, several uncultured groups such as Asgardarchaeota were detected in relatively high proportions. Sediment texture, the presence of Caulerpa or Cymodocea, depth, and geographic location were among the most important factors structuring microbial assemblages. Furthermore, microbial communities in the stations with the highest concentrations of potentially toxic elements (Fe, Pb, As, Zn, and Cd) were less stable than those in the non-contaminated stations. This finding suggests that bacteria colonizing heavily contaminated stations are specialists sensitive to change.
... Furthermore, the sequencing of 16SrRNA genes could identify an organism by reconstructing its phylogeny, along with the possibility of storing sequences in databases, resulting in the rapid adoption of the 16SrRNA gene by microbiologists. It is also created by multiple heterogeneous copies of the 16SrRNA gene within a genome (Dahllöf et al., 2000;Crosby and Criddle, 2003). Some studies have identified organisms with identical 16SrRNA gene sequences with significant sequence divergence in protein-encoding genes (Pernthaler and Pernthaler, 2005). ...
Article
Full-text available
Pseudomonas fluorescens is one of the main causes of septicemic diseases among freshwater fish, causing severe economic losses and decreasing farm efficiency. Thus, this research was aimed to investigate the occurrence of P. fluorescens in Nile Tilapia (O. niloticus) fish in Egypt, gene sequencing of 16SrDNA gene, and antimicrobial susceptibility. P. fluorescens strains were detected in 32% (128\400) of apparently healthy (9%; 36\400) and diseased (23%; 92\400) Nile tilapia fish. The highest prevalence was observed in gills of fish, 31.3% followed by intestine 26.9%, liver 24.2%, and kidneys 17.6%. The PCR results for the 16SrDNA gene of P. fluorescens showed 16SrDNA gene in 30% of examined isolates. Moreover, Homogeny and a strong relationship between strains of P. fluorescens was confirmed using 16SrDNA sequences. Beside the responsibility of 16SrDNA gene on the virulence of P. fluorescens. The results of antimicrobial susceptibility tests revealed that all strains were resistant to piperacillin (100%), followed by ceftazidime (29.7%), and cefepime (25.8%). The strains of P. fluorescence were highly sensitive to cefotaxime (74.2%), followed by ceftriaxone and levofloxacin (70.3% each). Interestingly, 29.7% of strains of P. fluorescens were multiple antimicrobial-resistant (MAR).
... Our absolute abundance method combines OD600 measurements and 16S rRNA gene sequencing to determine the absolute abundance of each species in multispecies communities. Biases in genome extraction efficiency, 16S rRNA gene copy number, and PCR amplification can impact measurements based on 16S rRNA gene sequencing (Crosby & Criddle, 2003;Laursen et al, 2017;Lim et al, 2018). We tested for potential bias in our workflow by measuring the relative abundance of mixed cultures containing 10% C. difficile based on OD600 measurements (Appendix Fig S7). ...
Article
Full-text available
Understanding the principles of colonization resistance of the gut microbiome to the pathogen Clostridioides difficile will enable the design of defined bacterial therapeutics. We investigate the ecological principles of community resistance to C. difficile using a synthetic human gut microbiome. Using a dynamic computational model, we demonstrate that C. difficile receives the largest number and magnitude of incoming negative interactions. Our results show that C. difficile is in a unique class of species that display a strong negative dependence between growth and species richness. We identify molecular mechanisms of inhibition including acidification of the environment and competition over resources. We demonstrate that Clostridium hiranonis strongly inhibits C. difficile partially via resource competition. Increasing the initial density of C. difficile can increase its abundance in the assembled community, but community context determines the maximum achievable C. difficile abundance. Our work suggests that the C. difficile inhibitory potential of defined bacterial therapeutics can be optimized by designing communities featuring a combination of mechanisms including species richness, environment acidification, and resource competition.
... Knowing the rRNA gene copy number and its variability in and among single cells of eukaryotes is important for rRNA gene-based surveys (Farrelly et al., 1995;Crosby and Criddle, 2003;Gribble and Anderson, 2007;Amaral-Zettler et al., 2011). For example, using rRNA gene as a bio-marker in high-throughput metabarcoding sequencing has been employed in numerous investigations on the microbial diversity in a wide variety of systems (e.g. ...
Article
Dinoflagellates are an ecologically important group of protists in aquatic environment and have evolved many unusual and enigmatic genomic features such as immense genome sizes, high repeated genes, and a large portion of hydroxymethyluracil in DNA. Although previous studies have observed positive correlations between the large subunit (LSU) rRNA gene copy number and genome size of a variety of eukaryotic organisms (e.g. higher plants and animals), or between cell volume and LSU rRNA gene copy number, and/or between genome size and cell size, which suggests a possible co-evolution among these three features in different lineages of life, it remains an open question regarding the relationships among these three parameters in dinoflagellates. For the first time, we estimated the copy numbers of the LSU rRNA gene, the genome sizes, and cell volumes within a broad range of dinoflagellates (covering 15 species of 11 genera) using single-cell qPCR-based assay (determining LSU rRNA gene copy number), FlowCAM (cell volume measurement), and ultraviolet spectrophotometry (genome size estimation). The measured copy number of LSU rRNA gene ranged from 398 ± 184 (Prorocentrum minimum) to 152,078 ± 33,555 copies·cell-1 (Alexandrium pacificum), while the genome size and the cell volume ranged from 5.6 ± 0.2 (Karlodinium veneficum) to 853 ± 19.9 pg·cell-1 (Pseliodinium pirum), and from 1,070 ± 225 (Kar. veneficum) to 168,474 ± 124,180 μm3 (Ps. pirum), respectively. Together with the three parameters measured in literature, there are significant positive linear correlations between LSU rRNA gene copy numbers and genome sizes, cell volumes and LSU rRNA gene copy numbers, and between genome sizes and cell volumes via comparisons of multi-model regression analyses, suggesting a dependence of genome size and rRNA gene copy number on the cell volumes of dinoflagellates. 
Validation of the measurement methods was conducted via comparisons between reported data in the literature and that predicted using the linear equations we obtained, and between genome size measured by flow cytometry (FCM) and ultraviolet spectrophotometry (Nanodrop). These results provide insightful understandings of dinoflagellate evolution in terms of the relationships among genomes, gene copy number, and cell volume, and of rRNA gene-based studies in intra-populational and intra-individual genetic diversity, taxonomy, and diversity assessment in the environment of dinoflagellates. The results also provide a dataset useful for reads calibration in environmental metabarcoding studies of dinoflagellates and selection of candidate species for whole genome sequencing.
Article
Recent developments in molecular biotechnology have introduced a variety of advanced techniques for examining the microbial ecosystems involved in food and beverage fermentations. Techniques such as denaturing gradient gel electrophoresis (DGGE), terminal restriction fragment length polymorphism (T-RFLP), fluorescent in situ hybridization (FISH), clone library construction, and quantitative PCR (qPCR) offer sensitive and reliable methods for analyzing microbial communities. These molecular approaches present significant advantages over traditional culture-based methods. Beyond their value in fermentation research, many of these tools also hold promise for rapid quality control in the beverage industry. Moreover, the growing availability of next-generation sequencing platforms, including Illumina and 454 sequencing systems, is making high-resolution microbial analysis more accessible to researchers focused on food and fermentation science. These technologies allow for detailed insights into microbial diversity and composition, enhancing our understanding and management of complex fermentation processes and hygiene practices. This review highlights the currently available molecular techniques for microbial community profiling, discusses their relevance to fermentation research and industrial applications, and explores future directions in microbial analysis for beer and wine production
Thesis
Full-text available
Microbial fuel cells (MFCs) are potential systems for renewable energy production from biomass and biomass-derived wastes. A microbial assemblage transfers electrons from reduced compounds to an anode as the electron acceptor, and these electrons then pass through a circuit and combine with protons and a terminal electron acceptor at the cathode. Recently several research groups have characterized the electrochemically active anode communities to potentially improve MFC performance. The results from various systems have shown very different and often diverse microbial communities. This research tested the hypothesis that identical MFC design and operation conditions yield reproducible performance and microbial consortia. Three sets of triplicate batch H-shape MFCs differing in energy sources (acetate, glucose, or lactate) were constructed to test reproducibility of anode bacterial community and performance. Anodes were constructed of carbon paper, inoculated with anaerobic sludge, and connected by 970 Ω resistance to the cathode. Controls included acetate-, lactate-, and glucose-fed MFCs operated in open-circuit mode. DNA was extracted on several sampling events from each anode biofilm and tested by denaturing gradient gel electrophoresis (DGGE) and sequencing of dominant bands for time- and reactor-variable community compositions. Monitoring of performance included analysis of electrochemical data, liquid samples for chemical oxygen demand and gas chromatography analyses, and gas headspace samples. The anode bacterial communities and performance of MFCs were reproducible for a given substrate and reactor configuration according to the community profiles generated by 16S rRNA gene-targeting PCR-DGGE and analysis of chemical and electrochemical data. Regardless of carbon source, all anode communities contained sequences closely affiliated with G. sulfurreducens (>99% similarity) and an uncultured bacterium clone in the Bacteroidaceae family (99% similarity). 
Firmicutes were found only in glucose-fed MFCs, presumably serving the role of converting complex carbon into simple molecules such as acetate and scavenging oxygen. Bands representing putative iron-reducing bacteria with the same melting domain were detected from anode communities from a triplicate set. The MFC performance was found to be reproducible with small divergence among triplicate reactors. Relatively small divergence to the average performance was detected. The Monod equation was modified to describe current density versus the inverse of external resistance, which describes the availability of the insoluble anode electron acceptor. MFCs with low internal resistance had a high maximum current density. The open-circuit control reactors initiated substantial power generation as soon as the circuits were connected, after substantial COD consumption in an open-circuit environment. Enrichment of G. sulfurreducens on anode biofilms became evident after current generation in the controls. The selective enrichment of an anode-attached community by medium changes led to considerable carbon dioxide accumulation and a decreasing molar ratio of methane to carbon dioxide in the headspace with successive stages. Total amounts of gas produced in open-circuit MFCs were much less than those of closed-circuit MFCs.
Article
Full-text available
Marine bacterioplankton diversity was examined by quantifying natural length variation in the 5' domain of small-subunit (SSU) rRNA genes (rDNA) amplified by PCR from a DNA sample from the Oregon coast. This new technique, length heterogeneity analysis by PCR (LH-PCR), determines the relative proportions of amplicons originating from different organisms by measuring the fluorescence emission of a labeled primer used in the amplification reaction. Relationships between the sizes of amplicons and gene phylogeny were predicted by an analysis of 366 SSU rDNA sequences from cultivated marine bacteria and from bacterial genes cloned directly from environmental samples. LH-PCR was used to compare the distribution of bacterioplankton SSU rDNAs from a coastal water sample with that of an SSU rDNA clone library prepared from the same sample and also to examine the distribution of genes in the PCR products from which the clone library was prepared. The analysis revealed that the relative frequencies of genes amplified from natural communities are highly reproducible for replicate sets of PCRs but that a bias possibly caused by the reannealing kinetics of product molecules can skew gene frequencies when PCR product concentrations exceed threshold values.
Article
Full-text available
The Ribosomal RNA Operon Copy Number Database (rrndb) is an Internet-accessible database containing annotated information on rRNA operon copy number among prokaryotes. Gene redundancy is uncommon in prokaryotic genomes, yet the rRNA genes can vary from one to as many as 15 copies. Despite the widespread use of 16S rRNA gene sequences for identification of prokaryotes, information on the number and sequence of individual rRNA genes in a genome is not readily accessible. In an attempt to understand the evolutionary implications of rRNA operon redundancy, we have created a phylogenetically arranged report on rRNA gene copy number for a diverse collection of prokaryotic microorganisms. Each entry (organism) in the rrndb contains detailed information linked directly to external websites including the Ribosomal Database Project, GenBank, PubMed and several culture collections. Data contained in the rrndb will be valuable to researchers investigating microbial ecology and evolution using 16S rRNA gene sequences. The rrndb web site is directly accessible on the WWW at http://rrndb.cme.msu.edu.
Soil bacterium DNA was isolated by minor modifications of previously described methods. After purification on hydroxyapatite and precipitation with cetylpyridinium bromide, the DNA was sheared in a French press to give fragments with an average molecular mass of 420,000 daltons. After repeated hydroxyapatite purification and precipitation with cetylpyridinium bromide, high-pressure liquid chromatography analysis showed the presence of 2.1% RNA or less, whereas 5-methylcytosine made up 2.9% of the total deoxycytidine content. No other unusual bases could be detected. The hyperchromicity was 31 to 36%, and the melting curve in 1× SSC (0.15 M NaCl plus 0.015 M sodium citrate) corresponded to 58.3 mol% G+C. High-pressure liquid chromatography analysis of two DNA samples gave 58.6 and 60.8 mol% G+C. The heterogeneity of the DNA was determined by reassociation of single-stranded DNA, measured spectrophotometrically. Owing to the high complexity of the DNA, the reassociation had to be carried out in 6× SSC with 30% dimethyl sulfoxide added. Cuvettes with a 1-mm light path were used, and the A275 was read. DNA concentrations as high as 950 µg ml⁻¹ could be used, and the reassociation rate of Escherichia coli DNA was increased about 4.3-fold compared with standard conditions. C0t1/2 values were determined relative to that for E. coli DNA, whereas calf thymus DNA was reassociated for comparison. Our results show that the major part of DNA isolated from the bacterial fraction of soil is very heterogeneous, with a C0t1/2 of about 4,600, corresponding to about 4,000 completely different genomes of standard soil bacteria.(ABSTRACT TRUNCATED AT 250 WORDS)
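The final estimate rests on the proportionality between C0t1/2 and genome complexity under second-order reassociation kinetics. A minimal sketch of that ratio follows. The C0t1/2 of a single "standard" soil bacterial genome is not given in the truncated abstract, so the value used below is a placeholder back-calculated from the two figures that are reported (4,600 and about 4,000). It should be read as an assumption, not a measured quantity, and the function name is hypothetical.

```python
def genome_equivalents(cot_half_sample, cot_half_single_genome):
    """Estimate how many distinct genomes a C0t1/2 value corresponds to.

    Both arguments must be measured under the same reassociation
    conditions and expressed in the same units.
    """
    if cot_half_single_genome <= 0:
        raise ValueError("reference C0t1/2 must be positive")
    return cot_half_sample / cot_half_single_genome


# ASSUMPTION: placeholder reference value, not reported in the abstract.
ASSUMED_SINGLE_GENOME_COT_HALF = 1.15
estimate = genome_equivalents(4600.0, ASSUMED_SINGLE_GENOME_COT_HALF)
```

With this placeholder the ratio comes out at roughly 4,000 genome equivalents; a different reference value would scale the estimate proportionally.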
Experiments were performed to evaluate the effectiveness of two different methodological approaches for recovering DNA from soil and sediment bacterial communities: cell extraction followed by lysis and DNA recovery (cell extraction method) versus direct cell lysis and alkaline extraction to recover DNA (direct lysis method). Efficiency of DNA recovery by each method was determined by spectrophotometric absorbance and by using a tritiated thymidine tracer. With both procedures, the use of polyvinylpolypyrrolidone was important for the removal of humic compounds to improve the purity of the recovered DNA; without extensive purification, various restriction enzymes failed to cut added target DNA. Milligram quantities of high-purity DNA were recovered from 100-g samples of both soils and sediments by the direct lysis method, a yield more than one order of magnitude higher than that of the cell extraction method. The ratio of labeled thymidine to total DNA, however, was higher in the DNA recovered by the cell extraction method than in that recovered by the direct lysis method, suggesting that the DNA recovered by the cell extraction method came primarily from active bacterial cells, whereas that recovered by the direct lysis method may have contained DNA from other sources.
The frequent discrepancy between direct microscopic counts and numbers of culturable bacteria from environmental samples is just one of several indications that we currently know only a minor part of the diversity of microorganisms in nature. A combination of direct retrieval of rRNA sequences and whole-cell oligonucleotide probing can be used to detect specific rRNA sequences of uncultured bacteria in natural samples and to microscopically identify individual cells. Studies have been performed with microbial assemblages of various complexities ranging from simple two-component bacterial endosymbiotic associations to multispecies enrichments containing magnetotactic bacteria to highly complex marine and soil communities. Phylogenetic analysis of the retrieved rRNA sequence of an uncultured microorganism reveals its closest culturable relatives and may, together with information on the physicochemical conditions of its natural habitat, facilitate more directed cultivation attempts. For the analysis of complex communities such as multispecies biofilms and activated-sludge flocs, a different approach has proven advantageous. Sets of probes specific to different taxonomic levels are applied consecutively beginning with the more general and ending with the more specific (a hierarchical top-to-bottom approach), thereby generating increasingly precise information on the structure of the community. Not only do rRNA-targeted whole-cell hybridizations yield data on cell morphology, specific cell counts, and in situ distributions of defined phylogenetic groups, but also the strength of the hybridization signal reflects the cellular rRNA content of individual cells. From the signal strength conferred by a specific probe, in situ growth rates and activities of individual cells might be estimated for known species. 
In many ecosystems, low cellular rRNA content and/or limited cell permeability, combined with background fluorescence, hinders in situ identification of autochthonous populations. Approaches to circumvent these problems are discussed in detail.
Determination of the relative abundance of a specific prokaryote in an environmental sample is of major interest in applied and environmental microbiology. Relative abundance can be calculated using knowledge of SSU rDNA copy number, amount of SSU rDNA in the sample, and a weighted average estimate of the genome sizes for organisms in the original sample. By surveying the literature, we provide estimates of genome size and SSU rDNA copy number for 303 and 101 prokaryotes, respectively. This compilation can be used to make reasonable estimates for a wide range of organisms in the calculation of relative abundance. A statistical analysis suggests that no correlation exists between genome size and SSU rDNA copy number. A phylogenetic analysis is used to offer insights into the evolution of both genome size and SSU rDNA copy number.
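One way the three quantities could combine is sketched below. This is an interpretation of the abstract, not the authors' published formula. It assumes that the target organism's SSU rDNA has been quantified as a copy count, that the total community DNA mass is known, and that a base pair of double-stranded DNA weighs roughly 650 g/mol. All function names and example numbers are hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # approximate mass of one base pair (assumption)


def genomes_in_dna(dna_mass_ng, mean_genome_size_bp):
    """Number of genomes represented by a given mass of community DNA."""
    genome_mass_ng = mean_genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO * 1e9
    return dna_mass_ng / genome_mass_ng


def relative_abundance(target_rdna_copies, copy_number,
                       dna_mass_ng, mean_genome_size_bp):
    """Fraction of all genomes in the sample that belong to the target."""
    if copy_number <= 0:
        raise ValueError("copy number must be positive")
    # Dividing by copy number converts gene copies to genomes (cells).
    target_genomes = target_rdna_copies / copy_number
    return target_genomes / genomes_in_dna(dna_mass_ng, mean_genome_size_bp)


# Hypothetical inputs: 1e5 target gene copies, 5 copies per genome,
# 10 ng community DNA, weighted mean genome size of 4 Mbp.
fraction = relative_abundance(1.0e5, 5, 10.0, 4.0e6)
```

The division by copy number is the step that removes the copy number bias: if the same gene count came from an organism with twice as many operons, the estimated abundance would be halved.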
Molecular techniques were applied to analyse the entire bacterial community, including both its cultivated and non-cultivated parts. DNA was extracted from samples of soils and sediments, and a combination of different molecular methods was used to investigate community structure and diversity in these environments. Reassociation of sheared and thermally denatured DNA in solution was used to measure the total genetic diversity. PCR-denaturing gradient gel electrophoresis (DGGE) analysis of rRNA genes gave information about changes in the numerically dominant bacterial populations. Hybridisation with phylogenetic group-specific probes and sequencing provided information about the affiliation of the bacterial populations. Using DNA reassociation analysis we demonstrated that bacterial communities in pristine soil and sediments may contain more than 10 000 different bacterial types. The diversity of the total soil community was at least 200 times higher than the diversity of bacterial isolates from the same soil. This indicates that the culturing conditions select for a distinct subpopulation of the bacteria present in the environment. Molecular methods were applied to monitor the effects of perturbations due to anthropogenic activities and pollution on microbial communities. Our investigations show that agricultural management, fish farming and pollution may lead to profound changes in the community structure and a reduction in the bacterial diversity.
Microbiologists have been constrained in their efforts to describe the compositions of natural microbial communities using traditional methods. Few microorganisms have sufficiently distinctive morphology to be recognized by microscopy. Culture-dependent methods are biased, as a microorganism can be cultivated only after its physiological niche is perceived and duplicated experimentally. It is therefore widely believed that fewer than 20% of the extant microorganisms have been discovered, and that culture methods are inadequate for studying microbial community composition. In view of the physiological and phylogenetic diversity among microorganisms, speculation that 80% or more of microbes remain undiscovered raises the question of how well we know the Earth's biota and its biochemical potential. We have performed a culture-independent analysis of the composition of a well-studied hot spring microbial community, using a common but distinctive cellular component, 16S ribosomal RNA. Our results confirm speculations about the diversity of uncultured microorganisms it contains.
Bacterioplankton are recognized as important agents of biogeochemical change in marine ecosystems, yet relatively little is known about the species that make up these communities. Uncertainties about the genetic structure and diversity of natural bacterioplankton populations stem from the traditional difficulties associated with microbial cultivation techniques. Discrepancies between direct counts and plate counts are typically several orders of magnitude, raising doubts as to whether cultivated marine bacteria are actually representative of dominant planktonic species. We have phylogenetically analysed clone libraries of eubacterial 16S ribosomal RNA genes amplified from natural populations of Sargasso Sea picoplankton by the polymerase chain reaction. The analysis indicates the presence of a novel microbial group, the SAR11 cluster, which appears to be a significant component of this oligotrophic bacterioplankton community. A second cluster of lineages related to the oxygenic phototrophs (cyanobacteria, prochlorophytes and chloroplasts) was also observed. However, none of the genes matched the small subunit rRNA sequences of cultivated marine cyanobacteria from similar habitats. The diversity of 16S rRNA genes observed within the clusters suggests that these bacterioplankton may be consortia of independent lineages sharing surprisingly distant common ancestors.