Metagenomic characterization of soil microbial communities in the Luquillo experimental forest (Puerto Rico) and implications for nitrogen cycling

Smruthi Karthikeyan1, Luis H. Orellana1, Eric R. Johnston1, Janet K. Hatt1, Frank E. Löffler3,4, Héctor L. Ayala-del-Río5, Grizelle González6, Konstantinos T. Konstantinidis1,2*

1 School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
2 School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
3 Center for Environmental Biotechnology, Department of Microbiology, Department of Civil and Environmental Engineering, Department of Biosystems Engineering and Soil Science, University of Tennessee, Knoxville, Tennessee, USA
4 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
5 Department of Biology, University of Puerto Rico at Humacao, Puerto Rico, USA
6 USDA Forest Service, International Institute of Tropical Forestry, Río Piedras, Puerto Rico, USA

*Address correspondence to Konstantinos Konstantinidis, kostas@ce.gatech.edu
School of Civil & Environmental Engineering, Georgia Institute of Technology. 311 Ferst Drive, ES&T Building, Room 3321, Atlanta, GA, 30332. Telephone: 404-639-4292

The authors declare no conflict of interest.

bioRxiv preprint doi: https://doi.org/10.1101/2020.06.15.153866; this version posted June 16, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-ND 4.0 International license.

ABSTRACT

The phylogenetic and functional diversity of microbial communities in tropical rainforests, and how these differ from temperate communities, remain poorly described but are directly related to the increased fluxes of greenhouse gases such as nitrous oxide (N2O) from the tropics. Towards closing these knowledge gaps, we analyzed replicated shotgun metagenomes representing distinct life zones from four locations in the Luquillo Experimental Forest (LEF), Puerto Rico. These soils had a distinct microbial community composition and lower species diversity when compared to temperate grasslands or agricultural soils. Unlike temperate soils, LEF soils showed little stratification with depth in the first 0-30 cm, with ~45% of community composition differences explained solely by location. The relative abundances and nucleotide sequences of N2O reductases (nosZ) were highly similar between tropical forest and temperate soils. However, respiratory NO reductase (norB) was 2-fold more abundant in the tropical soils, which might be related to their greater N2O emissions. Nitrogen fixation (nifH) also showed higher relative abundance in rainforest compared to temperate soils (20% vs. 0.1-0.3% of bacterial genomes in each soil type harbored the gene, respectively). Collectively, these results advance our understanding of spatial diversity and metabolic repertoire of tropical rainforest soil communities, and should facilitate future ecological modeling efforts.

Importance:

Tropical rainforests are the largest terrestrial sinks of atmospheric CO2 and the largest natural source of N2O emissions, two critical greenhouse gases for the climate. The microbial communities of rainforest soils that contribute to these fluxes, directly or indirectly through their effects on plant growth, remain poorly described by culture-independent methods. To close this knowledge gap, the present study applied shotgun metagenomics to samples selected from 3 distinct life zones within the Puerto Rico rainforest. The results advance our understanding of microbial community diversity in rainforest soils and should facilitate future studies of natural or manipulated perturbations of these critical ecosystems.

INTRODUCTION

Soil microbiomes are among the most complex ecosystems owing to microenvironments and steep physicochemical gradients, which can change on a micrometer or millimeter scale (1-3). Temporal and spatial heterogeneity, demographic stochasticity, ecotype mixing, dispersion and biotic interactions are the major drivers of soil microbial diversity in these ecosystems (4, 5). The formation of such “metacommunities”, coupled with biogeography and other edaphic factors, greatly influences the functional and taxonomic profile of a soil ecosystem at any given location (6).

Tropical rainforests (forests hereafter) are characterized by humid and wet climate patterns and account for a large portion of the world’s total forest cover (7). These forests have high levels of primary productivity (~30% of the total global production) due to large amounts of precipitation coupled with year-long warm temperatures and high levels of light (8). Consequently, high levels of biodiversity are observed in these forest soils, with unique microbial genotypic signatures being exclusive to this habitat/location, along with only a few cosmopolitan taxa that are shared with other (non-tropical forest) habitats (9, 10). Although tropical forest soils are critical ecosystems that host a plethora of distinct ecological niches, little is known about the metabolic potential of tropical soils, especially across elevation and depth gradients. Describing this metabolic diversity is important for studying and monitoring the microbial activities related to greenhouse gas fluxes, namely nitrous oxide (N2O) and carbon dioxide (CO2), from tropical soils (11).

Notably, tropical forests represent the largest terrestrial sinks of atmospheric CO2 and the largest natural source of N2O emissions (12-15). Natural soils have been reported to contribute over 43% of the total global N2O emissions, with tropical ecosystems being the highest contributors, having 2 to 4 times higher contributions compared to natural temperate ecosystems (16-19). These soils are also responsible for about 70% of terrestrial nitrogen fixation, which underlies, at least in part, their high rates of net primary productivity (11, 20).

Microbially-mediated nitrification and denitrification are the biotic processes contributing the most to global N2O soil emissions (60-70%) (19, 21, 22), although chemodenitrification, i.e., ferrous iron generated by ferric iron-reducing bacteria reacting with nitrite to produce N2O abiotically, is also likely high in iron-rich tropical soils (23). In soils, N2O is biologically produced as a result of incomplete nitrification, DNRA (dissimilatory nitrite reduction to ammonium) or denitrification respiratory pathways (22, 24, 25). Respiratory nitric oxide reductase (nor) is a key contributor to the microbial production of N2O and is commonly encoded in the genome of denitrifying bacteria as well as some ammonia-oxidizing organisms (22, 26-30).

While both biotic and abiotic processes contribute to N2O production, consumption of N2O is exclusively mediated by microbial N2O reductase (NosZ) activity (31-34). Yet, whether the denitrifying microorganisms in these soils differ from their counterparts in temperate soils, and whether the functional genes present in the community reflect the high nitrogen fluxes, remain unanswered questions despite their apparent importance for better management and modeling of tropical soil ecosystems. It has also been demonstrated that tropical forests have significantly higher rates of nitrogen fixation (~70% of total terrestrial nitrogen fixation) compared to other ecosystems, significantly affecting the nitrogen budgets in these ecosystems (3, 35-37). For instance, higher rates of nitrogen fixation in soils have been linked to nitrous oxide emissions (N loss) due to reduced N retention capacities (11, 38, 39). How these ecosystem rates translate to the nitrogen-fixing microbial (sub)community diversity and gene potential remains unclear.

The Luquillo Experimental Forest (LEF), also known as the El Yunque National Forest in Puerto Rico (PR), has been a long-term ecological research (LTER) site since 1988. The site is dedicated to the assessment of the effects of climate drivers on the biota and biogeochemistry. The forest has been subjected to several disturbance regimes over the last few decades, mostly natural and, to a smaller extent, anthropogenic, such as tourism and experimental manipulations (40, 41). This site encompasses distinct life zones characterized by sharp environmental gradients even across small spatial scales (40, 42, 43). The broad life zones based on the Holdridge classification system include the rain forest, wet forest, lower montane wet forest, and lower montane rain forest. These life zones are distinguished by elevation, temperature and rainfall patterns in addition to other edaphic factors (44-47). The elevation and rainfall patterns also tend to influence oxygen availability, redox potential, nutrient uptake and organic decomposition rates (44, 47, 48). The dynamic interplay of existing physicochemical gradients and climatic factors gives rise to a complex mosaic of biodiversity patterns observed in this forest (45). Hence, LEF represents an ideal environment to study tropical microbial community diversity patterns and their impacts on carbon and nitrogen cycling. The four sampling sites of this study were chosen to represent the distinct vegetation and life zones within the LEF.

Previous studies in the LEF, and similar forest regions, have mostly focused on the effects of redox dynamics, litter decomposition, nitrogen (N) and other nutrient fertilization on microbial community activity through enzyme assays. Few studies have examined microbial diversity patterns across an elevation gradient, and those were based only on low-resolution techniques such as terminal restriction fragment length polymorphism analysis (14, 49-53). Furthermore, studies linking marker-gene abundances (related to nitrogen cycling) with in-situ flux measurements showed very high N2O fluxes in the forest soils (54). However, the nosZ primers used targeted only the typical clade (clade I), thereby introducing a primer bias, which can be circumvented by employing metagenomic analyses.

With recent developments in next-generation DNA sequencing and associated bioinformatics binning algorithms, near-complete metagenome-assembled genomes (MAGs) can be recovered without cultivation (55, 56), opening new windows into studying soil microbial communities. Here, shotgun metagenomes originating from soils from the four different locations/life zones and three different depths in the LEF were analyzed to describe the microbial community diversity, biogeographical patterns, and metabolic potential differences across samples. Furthermore, the metagenomic data obtained from these soils were also compared to similar data from temperate grasslands in Oklahoma (OK) (57) and agricultural soils from Illinois (IL), USA (56), obtained previously by our team. By analyzing near-complete MAGs, we show that the most abundant microbial populations (based on number of reads recruited) at each of the sampling locations represent sequence-discrete populations, similar to those observed in other habitats (58). Using such sequence-discrete populations as the fundamental unit of microbial communities, we subsequently assess the population distribution at high resolution across the sampling sites (biogeography) and the gene content they encoded, with a focus on nitrogen metabolism.

RESULTS

Diversity of forest microbial communities

The LEF soil communities were compared to those of intensively studied ecosystems, namely the Oklahoma temperate grassland (OK) (57, 59) and Illinois agricultural soils (IL) (56), which were previously characterized with similar shotgun metagenomics approaches. Shotgun metagenomic sequencing recovered a total of 370 million reads across the 4 sites (Suppl. Table S2). Nonpareil 2.0 (60) was used to estimate sequence coverage, i.e., what fraction of the total extracted community DNA was sequenced. Nonpareil analysis of community diversity (Suppl. Fig. S1) showed that the agricultural Urbana (IL) site had the highest diversity of all the soils compared (NP diversity 24.02; note that NP values are given in log scale) and, consequently, the lowest sequence coverage at (only) 37.23%. El Verde and Pico del Este (20-30 cm) were the least diverse, and hence the most completely sequenced, with 87.1% and 73.4% coverage, respectively (NP diversity of 19.6 and 20.6, respectively, or about 2-3 orders of magnitude less diverse). Overall, OK and IL soils appear to be more diverse than the PR soils by about two orders of magnitude, on average, with an average Nonpareil value of 22.75 ± 0.37. Nearly complete coverage for El Verde and Pico del Este (20-30 cm samples) would require 2.402e+09 bp and 8.735e+09 bp, respectively, and, for the same level of coverage, the more complex communities in Urbana (IL) would require a substantially higher sequencing effort of 1.282e+12 bp. The OK soils had an estimated sequencing depth of 2.063e+11 ± 1.436e+11 bp.

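The sequencing-effort comparison above can be made explicit with a short calculation. The sketch below uses only the base-pair estimates quoted in the text; the function and dictionary names are illustrative and are not part of Nonpareil.

```python
import math

# Sequencing effort (bp) estimated for nearly complete coverage, as quoted in the text.
EFFORT_BP = {
    "El Verde (20-30 cm)": 2.402e9,
    "Pico del Este (20-30 cm)": 8.735e9,
    "Urbana (IL)": 1.282e12,
}


def effort_fold(site_bp, reference_bp):
    """Fold difference in required sequencing effort between two samples."""
    return site_bp / reference_bp


def orders_of_magnitude(fold):
    """Express a fold difference as a number of orders of magnitude (log10)."""
    return math.log10(fold)


for site in ("El Verde (20-30 cm)", "Pico del Este (20-30 cm)"):
    fold = effort_fold(EFFORT_BP["Urbana (IL)"], EFFORT_BP[site])
    print(f"Urbana vs {site}: {fold:.0f}-fold, "
          f"{orders_of_magnitude(fold):.1f} orders of magnitude")
```

Under these numbers, Urbana would need roughly 534-fold and 147-fold more sequencing than El Verde and Pico del Este, respectively, i.e., between 2 and 3 orders of magnitude.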
Community composition variation across the forest sites based on 16S rRNA gene sequences

The number of total 16S rRNA gene-based OTUs (operational taxonomic units) observed in each metagenome, as well as the Chao1 estimate of total OTUs present, reflected the degree of undersampling at each site (Suppl. Fig. S1 and S2), and were also consistent with the Nonpareil coverage estimates (Fig. 1). When Puerto Rico tropical soils (PR) were compared with the agricultural and grassland soils from the United States at the phylum level, Proteobacteria, Acidobacteria and Actinobacteria were the most abundant taxa across all ecosystems. However, in the forest soils, a few highly abundant OTUs dominated the entire soil community, whereas in the OK and IL soils, OTUs were more evenly distributed (Suppl. Fig. S2), consistent with the Nonpareil diversity results. Only 1.28% of the total detected OTUs (out of a total 8019, non-singleton OTUs) were shared among all PR samples, while 49.95% of OTUs were exclusive to a particular sampling site in PR, reflecting partly the under-sampling of the extant diversity by sequencing. Only 0.37% of the OTUs (out of a total 13760, non-singleton OTUs) were shared among all the sites across all 3 ecosystems, all of which were assignable to Alphaproteobacteria, Acidobacteria, Verrucomicrobia and Actinobacteria.

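For readers less familiar with these richness summaries, the following is a minimal sketch of the bias-corrected Chao1 estimator and of the fraction of OTUs shared among samples. The implementation and the toy inputs are illustrative assumptions; the text does not specify which Chao1 variant or software was used.

```python
def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts.

    Chao1 = S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs.
    """
    observed = [c for c in otu_counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def shared_fraction(samples):
    """Fraction of all detected OTUs that occur in every sample.

    `samples` is a list of sets of OTU identifiers, one set per sample.
    """
    all_otus = set().union(*samples)
    shared = set.intersection(*samples)
    return len(shared) / len(all_otus)


# Hypothetical toy data, not taken from the study.
counts = [10, 5, 1, 1, 1, 2, 2]
print(chao1(counts))
print(shared_fraction([{"a", "b", "c"}, {"b", "c", "d"}, {"c", "e"}]))
```

A large gap between the observed OTU count and the Chao1 estimate indicates undersampling, which is the pattern the paragraph above describes.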
Further, applying four additional DNA extraction methods to a selected subset of our samples, including two manual phenol-chloroform-based methods that are often advantageous for iron-rich soils like those in tropical forests, revealed broadly similar levels of diversity (Suppl. Fig. S3). Hence, the diversity patterns reported here are robust and independent of the DNA extraction method used.

Factors driving community diversity in the forest soils: Multidimensional scaling analysis of beta diversity

The PCoA (Principal Coordinate Analysis) plots, constructed based on the MASH distances among whole metagenomes, showed a clustering pattern that was primarily governed by site/location. Accordingly, site explained 45.22% of the total diversity (Fig. 2B). The non-metric multidimensional scaling (NMDS) analysis of the data revealed only site, pH and soil moisture to be statistically significant physicochemical parameters in explaining the observed community diversity (Fig. 2C, Suppl. Table S3). ANOSIM values also indicated site to be a more important factor than depth, with P values of 0.001 and 0.94, respectively. Based on the distance-based redundancy analysis (dbRDA), site was the most significant factor, even when the interplay between site and sampling depth was accounted for (Suppl. Table S4). Table 1 shows the partitioning of the variance between the proportion that is explained by constrained axes (i.e., environmental variables measured) and the proportion explained by unconstrained axes (i.e., variance not explained by environmental variables measured). The total variance explained by all (measured) environmental variables was 80.2% (Table 1), which is remarkably high for a soil ecosystem (61).

Major N cycling pathways

Genes encoding proteins involved in denitrification and nitrogen fixation were the most abundant nitrogen (N) cycling pathway genes detected at different sites. Overall, the forest soils harbored about a 2-3-fold higher abundance of denitrification genes, i.e., narG, nirK, and norB, catalyzing the reduction of nitrate, nitrite, and nitric oxide, respectively, compared to the grassland and agricultural soils (Fig. 2A). For instance, norB was the most abundant of the denitrification genes, with ~37% (SD 9.5%) of the genomes in the PR soils predicted to contain a norB gene, compared to ~17% (SD 4%) and ~14% (SD 1.3%) at IL and OK, respectively. Similarly, narG showed a 3-fold higher abundance in the PR soils compared to IL and OK soils (Fig. 2B). While denitrification gene abundances appeared higher in the tropical soils, the relative abundance of the nosZ gene did not differ: 11.6% (SD 3%) of the total genomes across the four locations in the LEF were predicted to encode nosZ, similar to the nosZ relative abundance in IL and OK soils, i.e., 11.75% (SD 5%) and 11.08% (SD 3%), respectively (differences not statistically significant at p=0.05). Similar to nosZ, the abundance of DNRA genes (namely, nrfA) was similar across all sites studied herein (9%, SD 1.9%).

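The "fraction of genomes encoding a gene" values above can be read as gene coverage relative to the coverage of a universal single-copy gene. The sketch below shows one common way to obtain such an estimate and then reproduces the fold differences from the reported percentages. The normalization approach, the function names and the example read counts are assumptions made for illustration; this section does not state how the percentages were computed.

```python
def genome_fraction(gene_reads, gene_length, scg_reads, scg_length):
    """Estimated fraction of genomes carrying a gene.

    Read counts are normalized by gene length and divided by the
    length-normalized count of a single-copy gene (assumed to occur
    once per genome).
    """
    gene_coverage = gene_reads / gene_length
    scg_coverage = scg_reads / scg_length
    return gene_coverage / scg_coverage


def fold_change(abundance_a, abundance_b):
    """Ratio of two relative abundances."""
    return abundance_a / abundance_b


# Hypothetical read counts and gene lengths (bp), for illustration only.
print(genome_fraction(gene_reads=300, gene_length=1500,
                      scg_reads=1000, scg_length=4000))

# norB percentages reported in the text: PR ~37%, IL ~17%, OK ~14%.
print(round(fold_change(37, 17), 2))  # PR vs IL
print(round(fold_change(37, 14), 2))  # PR vs OK
```

With the reported percentages, norB is roughly 2.2-fold and 2.6-fold more abundant in PR than in IL and OK soils, in line with the 2-3-fold range stated above.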
Predominant NosZ clades are shared among soil ecosystems

Placing nosZ-encoding reads on a reference nosZ phylogenetic tree revealed that atypical clades (clade II nosZ), affiliated predominantly with Opitutus, Anaeromyxobacter and other closely related genera, dominated the nosZ gene pool in the tropical forests (Fig. 3, Suppl. Figs. S4-S7). In contrast, a very small fraction of reads (<10% of total nosZ reads) were recruited to typical nosZ clades (or clade I). Members belonging to the clade II nosZ dominated the nosZ gene pool in OK and IL soils as well, with IL agricultural soils showing the greatest nosZ sequence diversity among the three regions. Notably, O. terrae-affiliated sequences represented the most abundant sub-clade (nosZ OTUs/sub-clades were defined at the 95% nucleotide sequence identity level) in all regions. Furthermore, most of the O. terrae-affiliated reads in the forest soil dataset appeared to be assigned to a single sub-clade, while their counterparts in the OK and IL soils appeared to be more evenly distributed among several closely related nosZ sub-clades, i.e., showing higher sequence diversity (Fig. 3, Suppl. Figs. S4-S7). O. terrae (strain DSM 11246/PB90-1) nosZ reads at >95% identity made up between 20% and 60% of the total nosZ reads recovered from the El Verde site and, together with the second most abundant sub-clade from Anaeromyxobacter sp., contributed over 30% of the total nosZ reads across all four PR locations (Fig. 5). Despite the significant taxonomic diversity observed in these soils (Suppl. Fig. S2), the soils from PR shared several abundant nosZ gene sequences/sub-clades at >95% nucleotide identity with soils in OK and IL (Fig. 3). Furthermore, in order to compare the predominant nosZ clades across the samples shown here, a new phylogenetic reference tree was constructed based on almost full-length sequences obtained from the assemblies/MAGs obtained from the metagenomes studied here (namely PR, OK, IL). The short reads identified as nosZ from the PR soils were placed on this tree and showed that the majority of these reads were recruited by the nosZ sequences obtained from these assemblies/MAGs, indicating that the nosZ sequences across the ecosystems studied here are similar (Suppl. Fig. S8).

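The 95% nucleotide identity cutoff used to define nosZ sub-clades can be illustrated with a small example. The sketch below computes percent identity between two pre-aligned sequences and assigns a read to the best-matching reference at or above a threshold. It is a simplified stand-in for phylogenetic placement, which is what the study actually used; the sequences and names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def assign_subclade(read, references, threshold=95.0):
    """Return the name of the best reference at >= threshold identity, else None.

    `references` maps a sub-clade name to a sequence aligned to `read`.
    """
    best_name, best_identity = None, threshold
    for name, ref in references.items():
        identity = percent_identity(read, ref)
        if identity >= best_identity:
            best_name, best_identity = name, identity
    return best_name


# Hypothetical 20-nt sequences: one mismatch to subclade_A, three to subclade_B.
read = "ACGTACGTACGTACGTACGT"
refs = {
    "subclade_A": "ACGTACGTACGTACGTACGA",
    "subclade_B": "TCGTACGAACGTACGTACGA",
}
print(assign_subclade(read, refs))
```

Reads that fall below the threshold for every reference are left unassigned, which corresponds to sequences outside the reference sub-clades.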
Nitrogen fixation potential

The nitrogen fixation genes (mainly nifH) were present at a much lower abundance in the lower-altitude forest samples. For instance, only ~1-3% of all genomes in the lower-altitude samples were predicted to encode nifH, compared to ~20% of the genomes in the higher-elevation samples (Pico del Este) (Fig. 2A), and almost none of the reads from IL and OK metagenomes appeared to encode nifH (<0.1%). Therefore, nitrogen fixation gene abundance patterns indicated a much stronger selection for nitrogen fixation in the tropical forest relative to temperate agricultural or natural prairie soils, especially at higher elevations. Furthermore, no ammonia-oxidizing genes (amoA) were detected in any of the soils except for Urbana soils (IL), which had a history of fertilizer (N) input.

Recovery of metagenome-assembled genomes (MAGs) representative of each site

In order to test the effect of biogeography (i.e., limits to dispersion) of taxa across the elevation gradient sampled, the distribution of abundant MAGs recovered from each PR sampling site (assembly and MAG statistics provided in Suppl. Table S6) was assessed across the sites using read-recruitment plots (62). Taxonomic assignment using the Microbial Genomes Atlas (63) revealed that the most abundant MAG at site El Verde (lowest elevation), representing 4.39% of the total metagenome, was affiliated with an unclassified Verrucomicrobia. The second most abundant (1.8% of total) was likely a member of the genus Ca. Koribacter (Acidobacteria), followed by an unclassified member of Acidobacteria (1.45% of total). The Verrucomicrobial MAG was found at an abundance of 1.03% of the total population at Sabana, and at 0.07% and 0.03% in Palm Nido and Pico del Este (highest elevation), respectively. Uneven coverage and nucleotide sequence identities across the length of the reference sequence were observed in the recruitment of short reads from Palm Nido and Pico del Este, as well as with all OK datasets, indicating that the related populations in the latter samples were divergent from the reference MAG (Suppl. Fig. S10). Therefore, at least this abundant low-elevation Verrucomicrobial population did not appear to be widespread in the other samples analyzed here (Suppl. Fig. S10). Similarly, the other abundant MAGs from other sites in the forest soils were unique to the corresponding sites (elevation) from which they were recovered. Almost all MAGs used in the analyses were assignable to a novel family, if not higher taxonomic rank, according to MiGA analysis (when compared to 11,566 classified isolate genomes available in the NCBI prokaryotic genome database), underscoring the large unexplored diversity harbored by the PR tropical rainforest soils. The sequence diversity/complexity as well as sequencing depth limited large-scale recovery of high-quality MAGs.

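The read-recruitment reasoning above (relative abundance of a MAG and evenness of coverage along its length) can be sketched as follows. The 95% identity filter, the window size, the tuple layout for alignments and all example numbers are assumptions chosen for illustration; they are not parameters taken from the study.

```python
def relative_abundance(reads_recruited, total_reads):
    """Percentage of metagenome reads recruited by a MAG."""
    return 100.0 * reads_recruited / total_reads


def window_coverage(alignments, genome_length, window=1000, min_identity=95.0):
    """Mean sequencing depth per window along a reference genome.

    `alignments` is an iterable of (start, end, identity) tuples with
    0-based, half-open coordinates; reads below `min_identity` are ignored.
    """
    n_windows = (genome_length + window - 1) // window
    covered_bases = [0] * n_windows
    for start, end, identity in alignments:
        if identity < min_identity:
            continue
        for w in range(start // window, (end - 1) // window + 1):
            w_start = w * window
            w_end = min(w_start + window, genome_length)
            overlap = min(end, w_end) - max(start, w_start)
            covered_bases[w] += max(overlap, 0)
    depths = []
    for w, bases in enumerate(covered_bases):
        w_len = min((w + 1) * window, genome_length) - w * window
        depths.append(bases / w_len)
    return depths


# Hypothetical alignments on a 3 kb reference.
hits = [(0, 1000, 99.0), (500, 1500, 98.0), (2000, 2100, 80.0)]
print(window_coverage(hits, genome_length=3000))
print(relative_abundance(439, 10000))
```

Windows with little or no high-identity coverage, next to well-covered windows, are the kind of uneven recruitment that suggests the sampled population is divergent from the reference MAG.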
Functional gene content of the MAGs

The genome sequences of the most abundant MAGs from each location (n=6) were analyzed in more detail to assess the functions they encoded, especially with respect to N cycling pathways (Fig. 4). MAGs from Pico del Este (highest elevation) showed a high abundance of N metabolism-related genes compared to MAGs from other sites (Fig. 4). Most notably, genes related to nitrogen fixation were found only in the Pico del Este MAG, which was consistent with the short-read analysis showing greater relative abundance of nifH at this site. Nitrification (ammonia oxidation-related) gene clusters were not detected in any of the recovered MAGs. norB and nosZ genes were found in three out of the six abundant MAGs analyzed. The most abundant El Verde MAG, most closely related to O. terrae (AAI = 40%), possessed a nosZ gene, which was congruent with the nosZ phylogeny described above (i.e., ~60% of the nosZ-encoding reads from El Verde had a closest match to O. terrae nosZ sequences).

DISCUSSION

The present study reported the taxonomic and gene content diversity of the poorly characterized tropical rainforest soils by using whole-community, shotgun metagenomic sequencing of samples from the El Yunque forest, Puerto Rico. The recovered near-complete MAGs represented several abundant and widespread organisms within this ecosystem that could serve as model organisms for future studies. Furthermore, since the Luquillo Experimental Forest (LEF) within El Yunque is subjected to varying natural as well as experimental (e.g., warming, phosphorus fertilization) perturbations, our study could also provide a baseline for these perturbations and future soil microbial studies at LEF. Our results revealed that the LEF soils harbor distinct microbial communities at sites located at distinct elevations above sea level. In contrast, and unlike several other soil ecosystems, sampling depth did not have a substantial impact on structuring community diversity, revealing no depth stratification in the LEF soils, at least for the depths sampled here (5-30 cm). This could be due to the lack of distinct soil horizons within the first 30 cm of the sampling sites, and indicates that the soil formation and/or physicochemical properties in these ecosystems could differ markedly from those in their temperate counterparts (44).

A recent study examining the dominant bacterial phylotypes across the globe found that the predominant phylotypes were widespread across ecosystems. The only exception to this pattern was tropical forest soils, which harbor distinct phylotypes (10). Consistent with these conclusions, the MAGs recovered from each LEF site represented, at a minimum, novel species and genera, further underlining the largely untapped microbial diversity harbored by tropical forest soils. Currently, the environmental factors driving these diversity patterns remain poorly understood for tropical forest soils (10), but our study provided several new insights into this issue.

In particular, sites El Verde and Sabana (the lowest-elevation sites) were more similar to each
other in community structure and diversity than to the two higher-elevation sampling sites, with
certain MAGs being present at both sites but not at any of the other (higher-elevation) sites
examined. This is presumably attributable to both sites having similar climate and vegetation
patterns (i.e., Tabonuco forest). On the other hand, Pico del Este was the highest-elevation site
and experiences almost continuous cloud cover as well as horizontal precipitation. The unique
topography of Pico del Este was reflected in distinct and deeply novel MAGs and gene content,
which differed markedly from the other three sampling sites within the LEF (PCoA plots, Fig. 2B).
The high water content of the Pico del Este soils gives rise to a unique ecosystem dominated by
epiphytes (e.g., moss) (64). The epiphytic community presumably has significant impacts on
nutrient (e.g., nitrogen) cycling (65), and influences the water input to the soil, thereby shaping
a unique habitat/niche for the soil microbes. Free-living microbes have been shown to be among the
highest contributors to biological N fixation in these forests, with high rates of nitrogenase
activity associated with the presence of moss/epiphytes (53, 66). Consistent with these previous
results and interpretations, Pico del Este showed an extremely high potential for nitrogen
fixation, i.e., it was estimated that one-fifth of the total bacterial genomes sampled possessed
genes for N fixation, which is at least 10 times greater than at any other site evaluated herein.
Accordingly, we found that site (location) alone explained about half (45%) of the beta diversity
differences observed among the four sampling sites, which reached ~80% when a few physicochemical
parameters, namely pH and moisture, were also included in the analyses (Fig. 2B, Table 1). This
is a remarkably high fraction of beta diversity explained by measured parameters for a soil
ecosystem (61) and likely reflects that location, and the physical properties that characterize
different locations within LEF, structured diversity much more strongly than in other soil
ecosystems. Tropical forests have also been shown to have significantly higher rates of nitrogen
fixation compared to other ecosystems, which can exceed the N retention capacity of the soil,
resulting in large N loss as N2O (67). The findings reported here on denitrification gene
abundances were generally consistent with these previous observations as well.

Links between soil community structure and nitrogen cycling can help close the
knowledge gaps on how forest ecosystems impact the release and mitigation of certain highly
potent greenhouse gases such as N2O. The gene abundances observed here, e.g., more than two-fold
higher abundance of norB (associated with NO reduction to N2O) and similar nosZ (N2O
consumption) abundances in tropical soils relative to temperate soils, were consistent with higher
N2O emissions observed previously from the tropics. Further, in acidic soils such as the tropical
forest soils evaluated in this study, lack of N limitation can suppress complete denitrification,
thereby leading to higher N2O release compared to other soil ecosystems (35). These
interpretations were consistent with our observation that the Puerto Rico (PR) soils harbored a
relatively high abundance of respiratory (related to denitrification) norB genes as well. Previous
studies have also suggested that most denitrifying bacterial genomes possess the genes required to
reduce nitrate to nitrous oxide but do not possess the gene responsible for the last step, i.e., N2O
reduction to N2, leading to the release of N2O gas (Braker and Tiedje, 2003; Richardson et al.,
2009; Giles et al., 2012; 22, 26-29), consistent with the findings of our study.

It has been established that tropical forest soils are the single highest contributor of
natural N2O emissions. While several abiotic and microbial processes can contribute to soil N2O,
N2O consumption is an exclusively microbial process, catalyzed by the enzyme product of the
nosZ genes (34). Based on the assessment of the nosZ gene phylogeny, it appears that almost all
of the nosZ genes from the tropical forest soils studied here belong to the previously overlooked
Clade II, or atypical, nosZ genes (32, 34, 68). This clade consists mainly of non-denitrifying and
secondary denitrifying N2O reducers. Despite the unique phylogenetic diversity harbored by
tropical soils in general, the nosZ gene sequence diversity appears to be shared with
temperate and agricultural soils (Fig. 4). These findings imply strong selection pressure for
conservation of nitrous oxide reductase sequences across tropical and temperate soil ecosystems
that is not apparently applicable to other N-cycling genes and pathways, which warrants further
attention in the future.

Integration of functional (e.g., gene expression) data with in-situ rate measurements will
provide a more complete picture of community composition and functioning in tropical forest soils.
The identification of certain biomarker genes such as nosZ sequences in our study could facilitate
future investigations on biogeochemical N-cycling and greenhouse gas emissions. For instance,
the assembled MAGs and gene sequences provided here could be useful for the design of
specific PCR assays for assessing transcript levels (activity), allowing potential linking of carbon
dioxide, methane, nitrogen, SOM, etc. turnover to the activity of individual populations. It would
also be interesting to assess how the findings reported here for the LEF apply (or not) to other
tropical forests, especially because our study is based on a relatively small sample size. While the
diversity in the Puerto Rico soils appears to be lower than that in temperate grassland and
agricultural soils, and different DNA extraction methods, including phenol-chloroform- and
kit-based, provided similar results (Fig. S3), it is important to note that DNA of the temperate
soil samples was extracted using different methods (OK soils were extracted using the PowerSoil
kit). Therefore, it would be important to confirm these preliminary findings by using the exact
same DNA extraction and sequencing procedures in all soils. Despite the small sample size, however,
our results showed differences along the elevation gradient sampled at the LEF that are
independent of DNA extraction (Suppl. Fig. S3) or sequencing methods, and consistent with our
metadata (Fig. 2) and previous process rate measurements. As the gradients at the LEF also
provide a natural setting to interpret the potential ramifications of climate change scenarios such
as altered precipitation patterns, the DNA sequences provided here could facilitate future
manipulation experiments with an emphasis on understanding and predicting the effects of
climate change on microbial community dynamics along the elevation gradient.

MATERIALS AND METHODS

Sampling sites
Soil samples were collected in February 2016 from four locations/sites across the LEF (18.3° N,
65.80° W). The four sites, namely Sabana, El Verde field station, Palm Nido and Pico del Este,
located at 265, 434, 634 and 953 m above mean sea level, respectively, were chosen due to their
unique landscape and rainfall patterns, which create distinct ecological niches (Fig. 2A).

The El Yunque forest is categorized into four distinct vegetation zones, namely the
Tabonuco, Palo Colorado, Sierra Palm and Dwarf/Elfin forests. Sites Sabana and El Verde, which
are located at the lowest elevations among the four sites within the LEF, fall under the Tabonuco
forest category in terms of vegetation, dominated by the tree species Dacryodes excelsa (native
to Puerto Rico). They are characterized by canopy cover and low light intensities at the ground
level, which account for the sparsely vegetated forest floor. However, these sites still harbor the
richest flora of all sites (69). Palm Nido is characterized by unstable, wetter soils and steeper
slopes, and its vegetation is dominated by the Sierra Palm (Prestoea montana). The site at the
highest elevation, Pico del Este (dwarf forest ecosystem or “elfin woodlands”), is characterized by
higher winds, lower temperatures and vegetation enveloped by clouds (41, 70); its main
vegetation is comprised of moss and epiphytes. Furthermore, highly acidic and continuously
water-saturated soils deficient in oxygen are major characteristics of this ecosystem, with
most mineral inputs for plants being dissolved in the rain and fog.

Three adjacent soil profiles were taken from each of the four LEF sites (4 sites
encompassing 3 life zones; Palo Colorado was not sampled). For each profile, individual soil
cores were taken at each depth (0-5 cm, 5-20 cm, 20-30 cm) using a 3-cm diameter x 15-cm length
soil corer (AMS Inc, Idaho) that was decontaminated between samplings by washing with 70%
ethanol. Soil samples were stored in sterile Whirl-pak bags and kept on ice during transport and
until storage at -80 °C. The three cores at each sampling depth were pooled together for
community DNA extraction, producing a total of twelve samples across the four sites.

Soil pH was determined using an automated LabFit AS-3000 pH Analyzer, and soil
extractable P, K, Ca, Mg, Mn, and Zn were obtained using the Mehlich-1 method and measured
using an inductively coupled plasma spectrograph at the University of Georgia Agricultural and
Environmental Services Laboratories (Athens, GA, USA). Soil extractable P using this method is
interpreted as the bioavailable fraction of P. NH4-N and NO3-N were measured by first
extracting them from soil samples with 0.1 N KCl, followed by the colorimetric phenate method
for NH4+ and the cadmium reduction method for NO3-. The physicochemical conditions at the sites
during the time of sampling are provided in Supplementary Table S1.

Community DNA extraction and sequencing
Total DNA from soil was extracted using the FastDNA SPIN KIT (MP Biomedicals, Solon, OH)
following the manufacturer’s procedure with the following modifications (71). Soils were air dried
under aseptic conditions and then ground with a mortar and pestle. Cells were lysed
by bead beating and DNA was eluted in 50 µl of sterile H2O. DNA sequencing libraries were
prepared using the Illumina Nextera XT DNA library prep kit according to the manufacturer’s
instructions, except that the protocol was terminated after isolation of cleaned double-stranded
libraries. Library concentrations were determined by fluorescent quantification using a Qubit HS
DNA kit and Qubit 2.0 fluorometer (ThermoFisher Scientific), and samples were run on a High
Sensitivity DNA chip using the Bioanalyzer 2100 instrument (Agilent) to determine library insert
sizes. An equimolar pool of the sequencing libraries was sequenced on an Illumina HiSeq 2500
instrument (located in the School of Biological Sciences, Georgia Institute of Technology) using
the HiSeq Rapid PE Cluster Kit v2 and HiSeq Rapid SBS Kit v2 (Illumina) for 300 cycles (2 x
150 bp paired end). Adapter trimming and demultiplexing of sequenced samples were carried out
by the HiSeq instrument. In total, 12 metagenomic datasets were generated (3 per site, one for each
of the three depths), and summary statistics for each dataset are provided in Supplementary
Table S2.

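The equimolar pooling step can be illustrated with a short calculation. The sketch below converts a fluorometric concentration (ng/µl) and a mean library size (bp) into molarity using the common approximation of 660 g/mol per base pair of double-stranded DNA, and then computes the volume of each library that contributes the same molar amount. The library names, concentrations, sizes and the target amount are hypothetical values chosen for illustration; they are not measurements from this study.

```python
# Minimal sketch of equimolar pooling; all sample values are hypothetical.

AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA concentration (ng/ul) and mean fragment size (bp) to nM."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_size_bp) * 1e6


def pooling_volumes(libraries, fmol_per_library):
    """Return the volume (ul) of each library that contains the same molar amount.

    libraries: dict mapping name -> (concentration in ng/ul, mean size in bp)
    fmol_per_library: femtomoles to take from every library
    """
    volumes = {}
    for name, (conc, size) in libraries.items():
        # 1 nM equals 1 fmol/ul, so volume (ul) = fmol / nM
        volumes[name] = fmol_per_library / molarity_nM(conc, size)
    return volumes


libraries = {"lib_A": (2.0, 500), "lib_B": (4.0, 500), "lib_C": (2.0, 1000)}
for name, vol in pooling_volumes(libraries, fmol_per_library=10.0).items():
    print(f"{name}: {vol:.2f} ul")
```

Note that a library with twice the mean insert size needs twice the volume at the same mass concentration, which is why the insert sizes from the Bioanalyzer are needed in addition to the Qubit readings.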
In order to test for any DNA extraction biases of the kit used above, especially given the
high iron/clay content that characterizes tropical forest soils and is known to affect the extraction
step, four additional DNA extraction methods were performed in parallel on a small subset of
samples collected in 2018 from the same sites (6 samples per extraction method for 5 extraction
methods covering the 4 sites). The methods included two manual (as opposed to kit-based)
phenol-chloroform-based methods (72, 73) as well as two other kit-based methods, namely
DNeasy PowerSoil and DNeasy PowerSoil Pro (Qiagen Inc.). For this evaluation, the soils were
first homogenized and subsequently divided into five subsamples, one for each method (including
the FastDNA SPIN KIT-based method mentioned above). The libraries were constructed and
sequenced in the same way as described above for the FastDNA SPIN KIT method.

All metagenomic datasets were deposited in the European Nucleotide Archive (ENA) under
project PRJEB26500. Additional data are available at http://enve-omics.ce.gatech.edu/data/prsoils.

Bioinformatics analysis of metagenomic reads and MAGs
The paired-end reads were trimmed and quality checked using the SolexaQA (74) package with a
cutoff of Q>20 (>99% accuracy per base position) and a minimum trimmed length of 50 bp.

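The link between the Q>20 cutoff and the stated >99% per-base accuracy follows from the Phred definition, in which a quality score Q corresponds to an error probability of 10^(-Q/10). The sketch below shows that conversion together with one simple trimming rule, namely keeping the longest contiguous stretch of bases whose quality exceeds the cutoff and discarding reads shorter than 50 bp afterwards. This is an illustration only and not the SolexaQA code, whose exact behavior may differ; the function names and the Phred+33 encoding are assumptions.

```python
# Minimal sketch of quality-based read trimming; not the SolexaQA implementation.


def phred_to_error(q):
    """A Phred quality Q corresponds to an error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


def longest_high_quality_segment(seq, qual, min_q=20, offset=33):
    """Return the longest contiguous stretch of seq whose base qualities exceed min_q.

    qual is a FASTQ quality string; offset=33 assumes Phred+33 encoding.
    If two stretches tie in length, the first one is returned.
    """
    best_start, best_len = 0, 0
    start = None
    for i, ch in enumerate(qual):
        if ord(ch) - offset > min_q:
            if start is None:
                start = i
            if i - start + 1 > best_len:
                best_start, best_len = start, i - start + 1
        else:
            start = None
    return seq[best_start:best_start + best_len]


def passes(seq, qual, min_q=20, min_len=50):
    """Keep a read only if its trimmed length is at least min_len."""
    return len(longest_high_quality_segment(seq, qual, min_q)) >= min_len


print(phred_to_error(20))  # error probability at the Q20 boundary
```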
i) Assembly and population genome binning: Co-assembly of the short reads from the same
location was performed using IDBA-UD (75), and only resulting contigs longer than 500 bp
were used for downstream analysis (e.g., functional annotation and MyTaxa
classification). Genes were predicted on the co-assembled contigs using MetaGeneMark (76) and
the predicted protein-coding regions were searched against the NCBI All Genome database using
Blastp (77). Since the assembly of individual datasets resulted mostly in short contigs (data not
shown), the contigs from the co-assembly (combining metagenomes from the three sampling
depths, for each site) were used for population genome binning. Contigs longer than 1 Kbp were
binned using MaxBin (78) to recover individual MAGs (default settings). The resulting bins
were quality checked for contamination and completeness using CheckM (79), and were further
evaluated for their intra-population diversity and sequence discreteness using fragment
recruitment analysis scripts available as part of the Enveomics collection (62), essentially as
previously described (80).

ii) Functional annotation of MAGs: Genes were predicted for each MAG using MetaGeneMark
and the predicted protein-coding regions were searched against the curated Swiss-Prot (81)
protein database using Blastp (77). Matches with a bitscore higher than 60 or amino acid identity
higher than 40% were used in subsequent analysis. The Swiss-Prot database identifiers were
mapped to their corresponding metabolic function based on the hierarchical classification
of the SEED subsystems (Level 1 category) (82). The relative abundance of genes
mapping to each function was calculated based on the number of predicted genes from each
MAG assigned to the function (for read-based assessment, see below). Relative abundance data
were plotted in R using the “superheat” package (https://arxiv.org/abs/1512.01524). Individual
biomarker genes for each step of the nitrogen cycling pathway were manually verified by
visually checking the alignment of the sequences identified by the pipeline outlined above
against verified reference sequences.

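The match filtering and per-function tallying described above can be sketched as follows. The example assumes Blastp was run with the standard 12-column tabular output (-outfmt 6), in which percent identity is the third column and bitscore the twelfth. The function names, the example hits and the mapping from Swiss-Prot identifiers to functional categories are hypothetical and serve only to illustrate the logic; they are not the scripts used in this study.

```python
# Minimal sketch: keep the best Blastp hit per gene and tally functional categories.
from collections import Counter


def best_hits(blast_lines, min_bitscore=60.0, min_identity=40.0):
    """Parse tabular BLAST output (-outfmt 6) and return the best hit per query.

    A hit is kept when bitscore > min_bitscore OR percent identity > min_identity,
    mirroring the thresholds stated in the text.
    """
    best = {}
    for line in blast_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity, bitscore = float(fields[2]), float(fields[11])
        if bitscore > min_bitscore or identity > min_identity:
            if query not in best or bitscore > best[query][1]:
                best[query] = (subject, bitscore)
    return {q: s for q, (s, _) in best.items()}


def relative_abundance(hits, subject_to_function):
    """Fraction of annotated genes assigned to each functional category."""
    counts = Counter(subject_to_function[s] for s in hits.values()
                     if s in subject_to_function)
    total = sum(counts.values())
    return {func: n / total for func, n in counts.items()} if total else {}


# Hypothetical hits: query, subject, pident, length, ..., evalue, bitscore
lines = [
    "gene1\tP1\t55.0\t200\t90\t0\t1\t200\t1\t200\t1e-50\t180",
    "gene1\tP2\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-30\t120",
    "gene2\tP3\t30.0\t80\t56\t0\t1\t80\t1\t80\t1e-3\t40",
    "gene3\tP2\t42.0\t100\t58\t0\t1\t100\t1\t100\t1e-5\t55",
]
mapping = {"P1": "Nitrogen Metabolism", "P2": "Respiration"}  # hypothetical mapping
hits = best_hits(lines)
print(relative_abundance(hits, mapping))
```

In this toy input, gene2 is discarded because it fails both thresholds, whereas gene3 is retained on identity alone even though its bitscore is below 60.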
iii) Functional annotation of short reads: Protein-coding sequences present in short reads were
predicted using FragGeneScan (83) with the 1% Illumina error model. The predicted genes were
then searched against the Swiss-Prot database using Blastp (best match). Low-quality matches
(bitscore < 60) were excluded, and the relative abundance of genes mapping to each function was
determined as described in the previous section.

Community diversity estimation
i) Nonpareil: Nonpareil (60) was used to estimate sequence coverage, i.e., what fraction of the
total extracted community DNA was sequenced, and to predict the sequencing effort required to
achieve "nearly complete coverage" (≥95%). The default parameters in Nonpareil were used for
all datasets. Only one of the two paired reads (forward) for each dataset was used to avoid
dependency between the paired reads, which can bias Nonpareil estimates (60).

ii) MASH and multidimensional scaling: MASH, a tool employing the MinHash dimensionality
reduction technique to compare sample-to-sample sequence composition based on k-mers (84),
was used to compute pairwise distances between whole metagenomic datasets and construct the
distance matrix to be used in multidimensional scaling. Pairwise MASH distances between the
metagenomic datasets were computed from the size-reduced sketches (default parameters).
PCoA (principal coordinate analysis) and NMDS (non-metric multidimensional scaling) were
employed to visualize the distance matrix and evaluate the physicochemical parameters driving
community diversity, respectively. Furthermore, dbRDA (distance-based redundancy analysis)
was used to obtain a finer resolution on the observed compositional variation. All of the above
statistical analyses were performed using the vegan package in R (85), with default settings.

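To make the MinHash idea concrete, the sketch below builds a bottom-s sketch (the s smallest k-mer hashes) for two sequence sets, estimates their Jaccard similarity j from the merged sketches, and converts it to the Mash distance D = -(1/k) ln(2j / (1 + j)). It is a simplified illustration and not the MASH implementation: md5 is used as a stand-in hash function, canonical (strand-independent) k-mers are not handled, and the function names are assumptions. The defaults k = 21 and s = 1000 correspond to the MASH defaults.

```python
# Minimal sketch of a MinHash (bottom-s) sketch and the Mash distance.
import hashlib
import math


def kmer_hashes(seq, k):
    """Hash every k-mer of seq to a 64-bit integer (md5 is a stand-in hash)."""
    return {
        int.from_bytes(hashlib.md5(seq[i:i + k].encode()).digest()[:8], "big")
        for i in range(len(seq) - k + 1)
    }


def sketch(seqs, k=21, s=1000):
    """Bottom-s sketch: the s smallest distinct k-mer hashes over all sequences."""
    hashes = set()
    for seq in seqs:
        hashes |= kmer_hashes(seq, k)
    return sorted(hashes)[:s]


def jaccard_estimate(sk_a, sk_b, s=1000):
    """Estimate Jaccard similarity from the s smallest hashes of the merged sketches."""
    set_a, set_b = set(sk_a), set(sk_b)
    merged = sorted(set_a | set_b)[:s]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_distance(j, k=21):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)); set to 1 when j is 0."""
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

Computing this distance for every pair of datasets yields the symmetric distance matrix that serves as input to the ordination methods.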
iii) 16S rRNA gene fragments recovered from shotgun metagenomes: 16S ribosomal RNA (16S)
gene fragments were extracted from the metagenomic datasets using Parallel-META (86). The
recovered 16S-carrying reads were clustered (‘closed-reference OTU picking’ strategy using
UCLUST (87)) and taxonomically classified based on their best match in the GreenGenes
database (88) at 97% identity in QIIME (89, 90). The relative abundance of the OTUs was
calculated based on the number of reads assigned to each OTU. Community composition was
assessed based on OTU taxonomic assignments at the genus and the phylum ranks and was
compared between the sites based on the relative abundance of OTUs at each site.

Identification of N cycling genes using ROCker
ROCker (91) was employed for precise identification and quantification of metagenomic reads
encoding nosZ (nitrous oxide reductase), norB (respiratory nitric oxide reductase, cytochrome bc
complex associated), nirK (nitrite reductase), narG (nitrate reductase), nrfA (nitrite reductase,
DNRA related), amoA (ammonia monooxygenase) and nifH (nitrogenase)
(http://enve-omics.ce.gatech.edu/rocker/models). Briefly, the short-read nucleotide sequences were
searched (using Blastx) against a training set for each abovementioned protein; training sets were
manually curated to encompass experimentally verified reference sequences as suggested
previously (91). The resulting matching sequences were then filtered using the ROCker compiled
model (model for 150 bp-long reads for PR and OK soils and 100 bp model for IL soils). Protein
abundances (based on the number of reads assigned to the protein) were normalized by
calculating genome equivalents. For the latter, the ROCker-filtered read counts were normalized
by the median length of the reference sequences of each protein, and the corresponding genome
equivalents were calculated as the ratio of these normalized NosZ (or another protein of interest)
read counts to those of RNA polymerase subunit B (rpoB), a universal single-copy marker.

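The genome-equivalent normalization can be written out as a short calculation. The sketch below assumes that both the target protein and rpoB read counts are divided by the median length of their respective reference sequences before the ratio is taken, which is our reading of the description above. The read counts and reference lengths are hypothetical, and the function names are introduced here for illustration only.

```python
# Minimal sketch of the genome-equivalent normalization; all numbers are hypothetical.
from statistics import median


def length_normalized(read_count, reference_lengths):
    """Divide a read count by the median length of the protein's reference sequences."""
    return read_count / median(reference_lengths)


def genome_equivalents(target_reads, target_ref_lengths, rpob_reads, rpob_ref_lengths):
    """Ratio of length-normalized target reads to length-normalized rpoB reads.

    Because rpoB is treated as a universal single-copy gene, the ratio approximates
    the fraction of genomes in the sample that carry the target gene.
    """
    return (length_normalized(target_reads, target_ref_lengths)
            / length_normalized(rpob_reads, rpob_ref_lengths))


# Hypothetical read counts and reference lengths (in amino acids)
ge = genome_equivalents(
    target_reads=150, target_ref_lengths=[290, 296, 300],
    rpob_reads=3400, rpob_ref_lengths=[1340, 1342, 1380],
)
print(f"{100 * ge:.1f}% of genomes")
```

Dividing by reference length corrects for the fact that longer genes recruit more reads at the same copy number, so that abundances of genes of different lengths become comparable.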
NosZ phylogenetic analysis
The NosZ reference protein sequences were aligned using CLUSTAL Omega (92)
and a maximum likelihood reference tree was created using RAxML v 8.0.19 (93) with a general
time reversible model option, gamma parameter optimization and the ‘-f a’ algorithm. The
ROCker-identified NosZ-encoding reads were extracted from all datasets, translated into protein
sequences using FragGeneScan, and then added to the reference alignment using Mafft (94). The
reads were placed in the phylogenetic tree using the RAxML EPA algorithm and visualized using
iTOL (95).

Intra-population diversity assessment based on recovered MAGs
The taxonomic affiliation of individual contig sequences of a MAG was evaluated using
MyTaxa, a homology-based classification tool (96). The MiGA (Microbial Genomes Atlas,
www.microbial-genomes.org) webserver was used for the taxonomic classification of the whole
MAG using the ANI/AAI concept. To assess intra-population diversity and sequence
discreteness, each target population MAG was searched against all the reads from each location
by Blastn (only contigs longer than 2 Kbp were used). Fragment recruitment plots were
constructed based on the Blastn matches (threshold values: nucleotide identity 75% and
alignment length 80 bp) using the Enveomics collection of scripts (62). The evenness of
coverage and sequence diversity of the reads across the length of the reference genome sequence
were used to evaluate the presence and discreteness of the population in the chosen dataset.

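The filtering of Blastn matches and the assessment of coverage evenness can be illustrated with a small example. The sketch below keeps matches that meet both thresholds, interpreted here as inclusive minimums, and then reports the mean read depth in consecutive windows along a reference contig, so that uneven coverage becomes visible as large differences between windows. It is not the Enveomics code; the record fields (pident, length, ref_start, ref_end), the window size and the example matches are hypothetical.

```python
# Minimal sketch: filter Blastn matches and summarize coverage along a reference contig.
# Field names (pident, length, ref_start, ref_end) are hypothetical.


def filter_hits(hits, min_identity=75.0, min_aln_len=80):
    """Keep matches meeting both thresholds (treated here as inclusive minimums)."""
    return [h for h in hits
            if h["pident"] >= min_identity and h["length"] >= min_aln_len]


def window_depth(hits, ref_len, window=1000):
    """Mean read depth in consecutive windows along the reference sequence."""
    depth = [0] * ref_len
    for h in hits:
        # Coordinates are 1-based; sorting handles reverse-strand matches.
        start, end = sorted((h["ref_start"], h["ref_end"]))
        for pos in range(max(start - 1, 0), min(end, ref_len)):
            depth[pos] += 1
    means = []
    for i in range(0, ref_len, window):
        chunk = depth[i:i + window]
        means.append(sum(chunk) / len(chunk))
    return means


hits = [
    {"pident": 98.0, "length": 100, "ref_start": 1, "ref_end": 100},
    {"pident": 80.0, "length": 100, "ref_start": 1500, "ref_end": 1401},  # reverse strand
    {"pident": 70.0, "length": 100, "ref_start": 200, "ref_end": 299},    # identity too low
    {"pident": 99.0, "length": 60, "ref_start": 300, "ref_end": 359},     # alignment too short
]
kept = filter_hits(hits)
depths = window_depth(kept, ref_len=2000, window=1000)
print(len(kept), depths)
```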
Acknowledgments. This work was supported by the U.S. Department of Energy, Office of
Biological and Environmental Research, Genomic Science Program (award DE-SC0006662) and
US National Science Foundation (award 1831582). GG was supported by the Luquillo Critical
Zone Observatory (National Science Foundation grant EAR-1331841) and the Luquillo Long-Term
Ecological Research Site (National Science Foundation grant DEB-1239764). All research
at the USDA Forest Service International Institute of Tropical Forestry is done in collaboration
with the University of Puerto Rico. We thank María Rivera and Humberto Robles from IITF for
their help in soil sampling.

TABLES:

Table 1: Proportion of total microbial community diversity explained by measured soil
environmental factors.

                 Inertia    Proportion    Rank
Total            0.1092     1
Constrained      0.0876     0.8021        6
Unconstrained    0.02161    0.1978        5

Site, sampling depth, pH, total nitrogen, total carbon, and moisture data were considered in the
analysis.

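The proportions in Table 1 follow directly from the inertia values: each proportion is the constrained or unconstrained inertia divided by the total inertia. The short check below recomputes them from the tabulated values; the function name is introduced here for illustration. Small differences in the last digit relative to the table are expected because the tabulated inertia values are themselves rounded.

```python
# Recompute the proportions of Table 1 from the tabulated inertia values.


def proportion_explained(component_inertia, total_inertia):
    """Share of the total inertia attributed to one component of the ordination."""
    return component_inertia / total_inertia


total = 0.1092
constrained = 0.0876      # variation explained by the measured factors
unconstrained = 0.02161   # residual variation

print(round(proportion_explained(constrained, total), 3))
print(round(proportion_explained(unconstrained, total), 3))
```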
FIGURE LEGENDS

Fig. 1: Sampling location map and microbial community diversity among the study sites. A.
Map of the four sampling sites within the Luquillo Experimental Forest (LEF). B. Principal
coordinate analysis (PCoA) plots based on MASH distances, colored by sampling site. C.
Nonmetric multidimensional scaling (NMDS) plot with the soil physicochemical parameters
incorporated. The arrow lengths are proportional to the strength of the correlations obtained
between measured soil physicochemical parameters and each ordination axis.

Fig. 2: Abundance of N cycling genes and their distribution across soil ecosystems. A.
Abundance of hallmark genes for denitrification, DNRA and nitrogen fixation pathways,
represented as genome equivalents (% of total bacterial genomes sampled that carry the gene) in
the metagenomes studied (see Figure key). B. Frequency of genomes carrying the respective
denitrifying gene across the three ecosystems studied. Genes denoted by the same letter are not
statistically significantly different between ecosystems (ANOVA Tukey test). Statistical
significance reported at p < 0.05. Note that nitrification genes were not detected in any of the
Puerto Rico sites.

Fig. 3: Phylogenetic diversity of nosZ-encoding sequences recovered in each soil ecosystem.
nosZ sequences were identified by the ROCker pipeline and placed in a reference nosZ
phylogeny as described in the Materials and Methods section. The radii of the pie charts are
proportional to the number of reads assigned to each sub-clade and the colors represent the
sampling sites from each ecosystem (see Figure key). Sub-clades highlighted in grey indicate the
most abundant sub-clades across all three ecosystems whereas the ones highlighted in blue were
abundant only in agricultural soils (IL). A. nosZ reads from every sampling site recruiting to
atypical (Clade II) clades. B. nosZ reads recruiting to typical (Clade I) clades. Inset shows the
most abundant sub-clade (Opitutus terrae) from panel A and its distribution across all sites. Note
that in all three ecosystems most of the reads recruit to atypical sub-clades. Suppl. Fig. S7 shows
the distribution of the reads among the most abundant sub-clades in detail.

Fig. 4: Functions encoded by the recovered population MAGs. Heatmap showing the relative
abundance of genes encoding the major metabolic functions (Level 1 of the SEED subsystem
category) for each MAG recovered from the four sites in Puerto Rico. The taxonomic
classification of each MAG based on MiGA is shown on the bottom left. The symbols at the
bottom of the heatmap denote the presence (or absence) of specific N-cycling genes, namely
denitrification and nitrogen fixation. No genes involved in nitrification were detected in any of
the bins.

FIGURES

Figure 1: Sampling location map and microbial community diversity among the study sites.
Figure 2: Abundance of N cycling genes and their distribution across soil ecosystems.
Figure 3: Phylogenetic diversity of nosZ-encoding sequences recovered in each soil ecosystem.
Figure 4: Functions encoded by the recovered population MAGs.

REFERENCES

1. Luo C, Rodriguez-R LM, Johnston ER, Wu L, Cheng L, Xue K, Tu Q, Deng Y, He Z,
Shi JZ, Yuan MM, Sherry RA, Li D, Luo Y, Schuur EAG, Chain P, Tiedje JM, Zhou J,
Konstantinidis KT. 2014. Soil Microbial Community Responses to a Decade of Warming
as Revealed by Comparative Metagenomics. Applied and Environmental Microbiology
80:1777-1786.
2. Fierer N, Bradford MA, Jackson RB. 2007. Toward an ecological classification
of soil bacteria. Ecology 88:1354-1364.
3. Van Der Heijden MGA, Bardgett RD, Van Straalen NM. 2007. The
unseen majority: soil microbes as drivers of plant diversity and productivity in terrestrial
ecosystems. Ecology Letters 11:296-310.
4. Battin TJ, Sloan WT, Kjelleberg S, Daims H, Head IM, Curtis TP, Eberl L. 2007.
Microbial landscapes: new paths to biofilm research. Nat Rev Micro 5:76-81.
5. Fierer N. 2017. Embracing the unknown: disentangling the complexities of the soil
microbiome. Nat Rev Micro 15:579-590.
6. Fierer N, Jackson RB. 2006. The diversity and biogeography of soil bacterial
communities. Proc Natl Acad Sci U S A 103:626-631.
7. DeAngelis KM, Chivian D, Fortney JL, Arkin AP, Simmons B, Hazen TC, Silver WL.
2013. Changes in microbial dynamics during long-term decomposition in tropical forests.
Soil Biology and Biochemistry 66:60-68.
8. Malhi Y, Phillips OL. 2004. Tropical forests and global atmospheric change: a synthesis.
Philosophical Transactions of the Royal Society of London Series B: Biological Sciences
359:549.
9. Nemergut DR, Costello EK, Hamady M, Lozupone C, Jiang L, Schmidt SK, Fierer N,
Townsend AR, Cleveland CC, Stanish L, Knight R. 2011. Global patterns in the
biogeography of bacterial taxa. Environmental Microbiology 13:135-144.
10. Delgado-Baquerizo M, Oliverio AM, Brewer TE, Benavent-González A, Eldridge DJ,
Bardgett RD, Maestre FT, Singh BK, Fierer N. 2018. A global atlas of the dominant
bacteria found in soil. Science 359:320.
11. Pajares S, Bohannan BJM. 2016. Ecology of Nitrogen Fixing, Nitrifying, and
Denitrifying Microorganisms in Tropical Forest Soils. Front Microbiol 7:1045.
12. Chambers JQ, Silver WL. 2004. Some aspects of ecophysiological and biogeochemical
responses of tropical forests to atmospheric change. Philos Trans R Soc Lond B Biol Sci
359:463-76.
13. Templer PH, Silver WL, Pett-Ridge J, DeAngelis KM, Firestone MK. 2008. Plant
and microbial controls on nitrogen retention and loss in a
humid tropical forest. Ecology 89:3030-3040.
14. Cusack DF, Silver WL, Torn MS, Burton SD, Firestone MK. 2011. Changes in microbial
community characteristics and soil organic matter with nitrogen additions in two tropical
forests. Ecology 92:621-632.
15. Firestone MK, Davidson E. 1989. Microbiological Basis of NO and N2O Production
and Consumption in Soil, vol 47.
16. Frasier R, Ullah S, Moore TR. 2010. Nitrous Oxide Consumption Potentials of Well-
678
drained Forest Soils in Southern Québec, Canada. Geomicrobiol J 27:53-60.
679
17. Schmidt J, Seiler W, Conrad R. 1988. Emission of nitrous oxide from temperate forest
680
soils into the atmosphere. J Atmos Chem 6:95-115.
681
18. Díaz-Pinés E, Werner C, Butterbach-Bahl K. 2018. Effects of Climate Change on CH4
682
and N2O Fluxes from Temperate and Boreal Forest Soils, p 11-27. In Perera AH,
683
Peterson U, Pastur GM, Iverson LR (ed), Ecosystem Services from Forest Landscapes:
684
Broadscale Considerations doi:10.1007/978-3-319-74515-2_2. Springer International
685
Publishing, Cham.
686
19. Butterbach-Bahl K, Baggs EM, Dannenmann M, Kiese R, Zechmeister-Boltenstern S.
687
2013. Nitrous oxide emissions from soils: how well do we understand the processes and
688
their controls? Philos Trans R Soc Lond B Biol Sci 368.
689
20. Houlton BZ, Wang Y-P, Vitousek PM, Field CB. 2008. A unifying framework for
690
dinitrogen fixation in the terrestrial biosphere. Nature 454:327.
691
21. Werner C, Butterbach‐Bahl K, Haas E, Hickler T, Kiese R. 2007. A global inventory of
692
N2O emissions from tropical rainforest soils using a detailed biogeochemical model.
693
Global Biogeochemical Cycles 21:GB3010.
694
22. Giles M, Morley N, Baggs EM, Daniell TJ. 2012. Soil nitrate reducing processes
695
drivers, mechanisms for spatial variation, and significance for nitrous oxide production.
696
Front Microbiol 3:407.
697
23. Onley JR, Ahsan S, Sanford RA, Löffler FE. 2018. Denitrification by Anaeromyxobacter
698
dehalogenans, a Common Soil Bacterium Lacking the Nitrite Reductase Genes nirS and
699
nirK. Applied and Environmental Microbiology 84.
700
24. Braker G, Tiedje JM. 2003. Nitric Oxide Reductase (norB) Genes from Pure Cultures and
701
Environmental Samples. Appl Environ Microbiol 69:3476-3483.
702
25. Richardson D, Felgate H, Watmough N, Thomson A, Baggs E. 2009. Mitigating release
703
of the potent greenhouse gas N2O from the nitrogen cycle could enzymic regulation
704
hold the key? Trends Biotechnol 27:388-397.
705
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
34
26. Spiro S. 2012. Nitrous oxide production and consumption: regulation of gene expression
706
by gas-sensitive transcription factors. Philos Trans R Soc Lond B Biol Sci 367:1213-
707
1225.
708
27. Black A, Hsu PCL, Hamonts KE, Clough TJ, Condron LM. 2016. Influence of copper on
709
expression of nirS, norB and nosZ and the transcription and activity of NIR, NOR and
710
N(2). Microb Biotechnol 9:381-388.
711
28. Garbeva P, Baggs EM, Prosser JI. 2007. Phylogeny of nitrite reductase (nirK) and nitric
712
oxide reductase (norB) genes from Nitrosospira species isolated from soil. FEMS
713
Microbiol Lett 266:83-9.
714
29. Zumft WG. 2005. Nitric oxide reductases of prokaryotes with emphasis on the
715
respiratory, hemecopper oxidase type. J Inorg Biochem 99:194-215.
716
30. Higgins SA, Schadt CW, Matheny PB, Löffler FE. 2018. Phylogenomics Reveal the
717
Dynamic Evolution of Fungal Nitric Oxide Reductases and Their Relationship to
718
Secondary Metabolism. Genome Biology and Evolution 10:2474-2489.
719
31. Braker G, Conrad R. 2011. Diversity, structure, and size of N(2)O-producing microbial
720
communities in soils--what matters for their functioning? Adv Appl Microbiol 75:33-70.
721
32. Hallin S, Philippot L, Löffler FE, Sanford RA, Jones CM. Genomics and Ecology of
722
Novel N<sub>2</sub>O-Reducing Microorganisms. Trends in Microbiology 26:43-55.
723
33. Richardson D, Felgate H, Watmough N, Thomson A, Baggs E. Mitigating release of the
724
potent greenhouse gas N<sub>2</sub>O from the nitrogen cycle &#x2013; could
725
enzymic regulation hold the key? Trends in Biotechnology 27:388-397.
726
34. Hallin S, Philippot L, Löffler FE, Sanford RA, Jones CM. 2018. Genomics and Ecology
727
of Novel N2O-Reducing Microorganisms. Trends Microbiol 26:43-55.
728
35. Zhang J, Cai Z, Cheng Y, Zhu T. 2009. Denitrification and total nitrogen gas production
729
from forest soils of Eastern China. Soil Biol Biochem 41:2551-2557.
730
36. Orr CH, James A, Leifert C, Cooper JM, Cummings SP. 2011. Diversity and Activity of
731
Free-Living Nitrogen-Fixing Bacteria and Total Bacteria in Organic and Conventionally
732
Managed Soils. Applied and Environmental Microbiology 77:911-919.
733
37. Townsend Alan R, Cleveland Cory C, Houlton Benjamin Z, Alden Caroline B, White
734
James WC. 2011. Multi‐element regulation of the tropical forest carbon cycle. Frontiers
735
in Ecology and the Environment 9:9-17.
736
38. Hedin LO, Brookshire ENJ, Menge DNL, Barron AR. 2005. The Nitrogen Paradox in
737
Tropical Forest Ecosystems. Annu Rev Ecol Evol Syst 40:613-635.
738
39. Brookshire EN, Gerber S, Menge DN, Hedin LO. 2012. Large losses of inorganic
739
nitrogen from tropical rainforests suggest a lack of nitrogen limitation. Ecol Lett 15:9-16.
740
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
35
40. S. Brown AEL, S. Silander, L. Liegel. 1983. Research history and opportunities in the
741
Luquillo Experimental Forest. Tech Rep SO-44, US Forest Service p. 128.
742
41. Gould WA, Gonzalez G, Rivera GC. 2006. Structure and Composition of Vegetation
743
along an Elevational Gradient in Puerto Rico. J Veg Sci 17:653-664.
744
42. Aide TM, Zimmerman JK, Herrera L, Rosario M, Serrano M. 1995. Forest recovery in
745
abandoned tropical pastures in Puerto Rico. Forest Ecology and Management 77:77-86.
746
43. Weaver PLG, W. A. 2013. Forest vegetation along environmental gradients in
747
northeastern Puerto Rico. Pages 43-66 in G. González, M. R. Willig, and R. B. Waide,
748
editors. Ecological gradient analyses in a tropical landscape. Ecological Bulletins 54.
749
Wiley-Blackwell, Hoboken, NJ. 2013.
750
44. Ping C-LM, G. J.; Stiles, C. A.; Gonzalez, G. 2013. Soil characteristics, carbon stores,
751
and nutrient distribution in eight forest types along an elevational gradient, eastern Puerto
752
Rico. Pages 67-86 in G. González, M. R. Willig, and R. B. Waide, editors. Ecological
753
gradient analyses in a tropical landscape. Ecological Bulletins 54. Wiley-Blackwell,
754
Hoboken, NJ. 2013.
755
45. González G, R. Willig M, Waide R. 2013. Ecological gradient analyses in a tropical
756
landscape: multiple perspectives and emerging themes, vol 54.
757
46. González G, Lodge DJ. 2017. Soil Biology Research across Latitude, Elevation and
758
Disturbance Gradients: A Review of Forest Studies from Puerto Rico during the Past 25
759
Years. Forests 8.
760
47. Van Beusekom AE, González G, Rivera MM. 2014. Short-Term Precipitation and
761
Temperature Trends along an Elevation Gradient in Northeastern Puerto Rico. Earth
762
Interactions 19:1-33.
763
48. Liptzin D, Silver Whendee L. 2015. Spatial patterns in oxygen and redox sensitive
764
biogeochemistry in tropical forest soils. Ecosphere 6:1-14.
765
49. Hall SJ, Liptzin D, Buss HL, DeAngelis K, Silver WL. 2016. Drivers and patterns of iron
766
redox cycling from surface to bedrock in a deep tropical forest soil: a new conceptual
767
model. Biogeochemistry 130:177-190.
768
50. DeAngelis KM, Chivian D, Fortney JL, Arkin AP, Simmons B, Hazen TC, Silver WL.
769
2013. Changes in microbial dynamics during long-term decomposition in tropical forests.
770
Soil Biol Biochem 66:60-68.
771
51. Waldrop MP, Balser TC, Firestone MK. 2000. Linking microbial community
772
composition to function in a tropical soil. Soil Biol Biochem 32:1837-1846.
773
52. Templer Pamela H, Silver Whendee L, Pett-Ridge J, Kristen MD, Firestone Mary K.
774
2008. Plant and microbial controls on nitrogen retention and loss in a humid tropical
775
forest. Ecology 89:3030-3040.
776
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
36
53. Cantrell SA, Lodge DJ, Cruz CA, Garcia LM, Perez-Jimenez JR, Molina M. 2013.
777
Differential abundance of microbial functional groups along the elevation gradient from
778
the coast to the Luquillo Mountains, p 87-100, Ecological Bulletins 54, vol 54.
779
54. Lammel DR, Feigl BJ, Cerri CC, Nüsslein K. 2015. Specific microbial gene abundances
780
and soil parameters contribute to C, N, and greenhouse gas process rates after land use
781
change in Southern Amazonian Soils. Front Microbiol 6:1057.
782
55. Jousset A, Bienhold C, Chatzinotas A, Gallien L, Gobet A, Kurm V, Küsel K, Rillig MC,
783
Rivett DW, Salles JF, van der Heijden MGA, Youssef NH, Zhang X, Wei Z, Hol WHG.
784
2017. Where less may be more: how the rare biosphere pulls ecosystems strings. ISME J
785
11:853.
786
56. Orellana LH, Chee-Sanford JC, Sanford RA, Löffler FE, Konstantinidis KT. 2017. Year-
787
round shotgun metagenomes reveal stable microbial communities in agricultural soils and
788
novel ammonia oxidizers responding to fertilization. Applied and Environmental
789
Microbiology.
790
57. Johnston ER, Rodriguez-R LM, Luo C, Yuan MM, Wu L, He Z, Schuur EAG, Luo Y,
791
Tiedje JM, Zhou J, Konstantinidis KT. 2016. Metagenomics Reveals Pervasive Bacterial
792
Populations and Reduced Community Diversity across the Alaska Tundra Ecosystem.
793
Front Microbiol 7:579.
794
58. Caro-Quintero A, Konstantinidis KT. 2012. Bacterial species may exist, metagenomics
795
reveal. Environ Microbiol 14:347-355.
796
59. Luo C, Rodriguez RL, Johnston ER, Wu L, Cheng L, Xue K, Tu Q, Deng Y, He Z, Shi
797
JZ, Yuan MM, Sherry RA, Li D, Luo Y, Schuur EA, Chain P, Tiedje JM, Zhou J,
798
Konstantinidis KT. 2014. Soil microbial community responses to a decade of warming as
799
revealed by comparative metagenomics. Appl Environ Microbiol 80:1777-86.
800
60. Rodriguez-R LM, Konstantinidis KT. 2014. Nonpareil: a redundancy-based approach to
801
assess the level of coverage in metagenomic datasets. Bioinformatics 30:629-635.
802
61. Zhang X, Johnston ER, Li L, Konstantinidis KT, Han X. 2017. Experimental warming
803
reveals positive feedbacks to climate change in the Eurasian Steppe. ISME J 11:885-895.
804
62. Rodriguez-R LM, Konstantinidis KT. 2016. The enveomics collection: a toolbox for
805
specialized analyses of microbial genomes and metagenomes. PeerJ Preprints 4:e1900v1.
806
63. Rodriguez RL, Gunturu S, Harvey WT, Rossello-Mora R, Tiedje JM, Cole JR,
807
Konstantinidis KT. 2018. The Microbial Genomes Atlas (MiGA) webserver: taxonomic
808
and gene diversity analysis of Archaea and Bacteria at the whole genome level. Nucleic
809
Acids Res 46:W282-w288.
810
64. Bruijnzeel LA, Proctor J. Hydrology and Biogeochemistry of Tropical Montane Cloud
811
Forests: What Do We Really Know?, p 38-78. In Hamilton LS, Juvik JO, Scatena FN
812
(ed), Springer US,
813
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
37
65. Song L, Lu H-Z, Xu X-L, Li S, Shi X-M, Chen X, Wu Y, Huang J-B, Chen Q, Liu S, Wu
814
C-S, Liu W-Y. 2016. Organic nitrogen uptake is a significant contributor to nitrogen
815
economy of subtropical epiphytic bryophytes. Scientific Reports 6:30408.
816
66. Cusack DF, Silver W, McDowell WH. 2009. Biological Nitrogen Fixation in Two
817
Tropical Forests: Ecosystem-Level Patterns and Effects of Nitrogen Fertilization.
818
Ecosystems 12:1299-1315.
819
67. Hedin LO, Brookshire ENJ, Menge DNL, Barron AR. 2009. The Nitrogen Paradox in
820
Tropical Forest Ecosystems. Annual Review of Ecology, Evolution, and Systematics
821
40:613-635.
822
68. Sanford RA, Wagner DD, Wu Q, Chee-Sanford JC, Thomas SH, Cruz-García C,
823
Rodríguez G, Massol-Deyá A, Krishnani KK, Ritalahti KM, Nissen S, Konstantinidis
824
KT, Löffler FE. 2012. Unexpected nondenitrifier nitrous oxide reductase gene diversity
825
and abundance in soils. Proceedings of the National Academy of Sciences 109:19709.
826
69. Johnston MH. 1992. Soil-Vegetation Relationships in a Tabonuco Forest Community in
827
the Luquillo Mountains of Puerto Rico. Journal of Tropical Ecology 8:253-263.
828
70. Weaver PL. 1995. The Colorado and Dwarf Forests of Puerto Rico’s Luquillo Mountains,
829
p 109-141. In Lugo AE, Lowe C (ed), Tropical Forests: Management and Ecology
830
doi:10.1007/978-1-4612-2498-3_5. Springer New York, New York, NY.
831
71. Rodríguez-Minguela CM, Apajalahti JHA, Chai B, Cole JR, Tiedje JM. 2009. Worldwide
832
Prevalence of Class 2 Integrases outside the Clinical Setting Is Associated with Human
833
Impact. Applied and Environmental Microbiology 75:5100-5110.
834
72. Griffiths RI, Whiteley AS, O'Donnell AG, Bailey MJ. 2000. Rapid method for
835
coextraction of DNA and RNA from natural environments for analysis of ribosomal
836
DNA- and rRNA-based microbial community composition. Applied and environmental
837
microbiology 66:5488-5491.
838
73. Tsai YL, Olson BH. 1991. Rapid method for direct extraction of DNA from soil and
839
sediments. Applied and environmental microbiology 57:1070-1074.
840
74. Cox MP, Peterson DA, Biggs PJ. 2010. SolexaQA: At-a-glance quality assessment of
841
Illumina second-generation sequencing data. BMC Bioinformatics 11:485.
842
75. Peng Y, Leung HC, Yiu SM, Chin FY. 2012. IDBA-UD: a de novo assembler for single-
843
cell and metagenomic sequencing data with highly uneven depth. Bioinformatics
844
28:1420-8.
845
76. Zhu W, Lomsadze A, Borodovsky M. 2010. Ab initio gene identification in metagenomic
846
sequences. Nucleic Acids Res 38:e132.
847
77. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL.
848
2009. BLAST+: architecture and applications. BMC Bioinformatics 10:421-421.
849
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
38
78. Wu Y-W, Tang Y-H, Tringe SG, Simmons BA, Singer SW. 2014. MaxBin: an automated
850
binning method to recover individual genomes from metagenomes using an expectation-
851
maximization algorithm. Microbiome 2:26.
852
79. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. 2015. CheckM:
853
assessing the quality of microbial genomes recovered from isolates, single cells, and
854
metagenomes. Genome Research 25:1043-1055.
855
80. Konstantinidis KT, DeLong EF. 2008. Genomic patterns of recombination, clonal
856
divergence and environment in marine microbial populations. ISME J
857
doi:http://www.nature.com/ismej/journal/v2/n10/suppinfo/ismej200862s1.html.
858
81. Bairoch A, Apweiler R. 2000. The SWISS-PROT protein sequence database and its
859
supplement TrEMBL in 2000. Nucleic Acids Res 28:45-48.
860
82. Overbeek R, Olson R, Pusch GD, Olsen GJ, Davis JJ, Disz T, Edwards RA, Gerdes S,
861
Parrello B, Shukla M, Vonstein V, Wattam AR, Xia F, Stevens R. 2014. The SEED and
862
the Rapid Annotation of microbial genomes using Subsystems Technology (RAST).
863
Nucleic Acids Res 42:D206-D214.
864
83. Rho M, Tang H, Ye Y. 2010. FragGeneScan: predicting genes in short and error-prone
865
reads. Nucleic Acids Res 38:e191-e191.
866
84. Ondov BD, Treangen TJ, Melsted P, Mallonee AB, Bergman NH, Koren S, Phillippy
867
AM. 2016. Mash: fast genome and metagenome distance estimation using MinHash.
868
Genome Biology 17:132.
869
85. Oksanen J. 2011. Multivariate analysis of ecological communities in R: vegan tutorial
870
v.2.02. http://ccoulufi/~jarioksa/opetus/metodi/vegantutorpdf.
871
86. Su X, Pan W, Song B, Xu J, Ning K. 2014. Parallel-META 2.0: Enhanced Metagenomic
872
Data Analysis with Functional Annotation, High Performance Computing and Advanced
873
Visualization. PLOS ONE 9:e89323.
874
87. Edgar RC. 2010. Search and clustering orders of magnitude faster than BLAST.
875
Bioinformatics 26:2460-2461.
876
88. McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, Andersen
877
GL, Knight R, Hugenholtz P. 2012. An improved Greengenes taxonomy with explicit
878
ranks for ecological and evolutionary analyses of bacteria and archaea. The ISME Journal
879
6:610-618.
880
89. Kuczynski J, Stombaugh J, Walters WA, González A, Caporaso JG, Knight R. 2011.
881
Using QIIME to analyze 16S rRNA gene sequences from Microbial Communities.
882
Current protocols in bioinformatics / editoral board, Andreas D Baxevanis [et al]
883
CHAPTER:Unit10.7-Unit10.7.
884
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
39
90. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer
885
N, Peña AG, Goodrich JK, Gordon JI, Huttley GA, Kelley ST, Knights D, Koenig JE,
886
Ley RE, Lozupone CA, McDonald D, Muegge BD, Pirrung M, Reeder J, Sevinsky JR,
887
Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T, Zaneveld J, Knight R. 2010.
888
QIIME allows analysis of high-throughput community sequencing data. Nat Methods
889
7:335.
890
91. Orellana LH, Rodriguez-R LM, Konstantinidis KT. 2017. ROCker: accurate detection
891
and quantification of target genes in short-read metagenomic data sets by modeling
892
sliding-window bitscores. Nucleic Acids Res 45:e14-e14.
893
92. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H,
894
Remmert M, Söding J, Thompson JD, Higgins DG. 2011. Fast, scalable generation of
895
high‐quality protein multiple sequence alignments using Clustal Omega. Molecular
896
Systems Biology 7.
897
93. Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis
898
of large phylogenies. Bioinformatics 30:1312-1313.
899
94. Katoh K, Misawa K, Kuma K-i, Miyata T. 2002. MAFFT: a novel method for rapid
900
multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res
901
30:3059-3066.
902
95. Letunic I, Bork P. 2016. Interactive tree of life (iTOL) v3: an online tool for the display
903
and annotation of phylogenetic and other trees. Nucleic Acids Res 44:W242-5.
904
96. Luo C, Rodriguez-R LM, Konstantinidis KT. 2014. MyTaxa: an advanced taxonomic
905
classifier for genomic and metagenomic sequences. Nucleic Acids Res 42:e73-e73.
906
907
.CC-BY-ND 4.0 International license(which was not certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprintthis version posted June 16, 2020. . https://doi.org/10.1101/2020.06.15.153866doi: bioRxiv preprint
Fungi expressing P450nor, an unconventional nitric oxide (NO) reducing cytochrome P450, are considered significant contributors to environmental nitrous oxide (N2O) emissions. Despite extensive efforts, fungal contributions to N2O emissions remain uncertain. For example, the majority of N2O emitted from antibiotic-amended soil microcosms is attributed to fungal activity, yet axenic fungal cultures do not couple N-oxyanion respiration to growth and these fungi produce only minor quantities of N2O. To assist in reconciling these conflicting observations and produce a benchmark genomic analysis of fungal denitrifiers, genes underlying denitrification were examined in > 700 fungal genomes. Of 167 p450nor-containing genomes identified, 0, 30, and 48 also harbored the denitrification genes narG, napA or nirK, respectively. Compared to napA and nirK, p450nor was twice as abundant and exhibited two to five-fold more gene duplications, losses, and transfers, indicating a disconnect between p450nor presence and denitrification potential. Furthermore, co-occurrence of p450nor with genes encoding NO-detoxifying flavohemoglobins (Spearman r = 0.87, p = 1.6e-10) confounds hypotheses regarding P450nor's primary role in NO detoxification. Instead, ancestral state reconstruction united P450nor with actinobacterial cytochrome P450s (CYP105) involved in secondary metabolism (SM) and 19 (11%) p450nor-containing genomic regions were predicted to be SM clusters. Another 40 (24%) genomes harbored genes nearby p450nor predicted to encode hallmark SM functions, providing additional contextual evidence linking p450nor to SM. These findings underscore the potential physiological implications of widespread p450nor gene transfer, support the undiscovered affiliation of p450nor with fungal SM, and challenge the hypothesis of p450nor's primary role in denitrification.
Article
Full-text available
The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of prokaryotic species and communities but it offers limited resolution at the species and finer levels, and cannot represent the whole-genome diversity and fluidity. To overcome these limitations, we introduced the Microbial Genomes Atlas (MiGA), a webserver that allows the classification of an unknown query genomic sequence, complete or partial, against all taxonomically classified taxa with available genome sequences, as well as comparisons to other related genomes including uncultivated ones, based on the genome-aggregate Average Nucleotide and Amino Acid Identity (ANI/AAI) concepts. MiGA integrates best practices in sequence quality trimming and assembly and allows input to be raw reads or assemblies from isolate genomes, single-cell sequences, and metagenome-assembled genomes (MAGs). Further, MiGA can take as input hundreds of closely related genomes of the same or closely related species (a so-called 'Clade Project') to assess their gene content diversity and evolutionary relationships, and calculate important clade properties such as the pangenome and core gene sets. Therefore, MiGA is expected to facilitate a range of genome-based taxonomic and diversity studies, and quality assessment across environmental and clinical settings. MiGA is available at http://microbial-genomes.org/.
Article
Full-text available
The immense diversity of soil bacterial communities has stymied efforts to characterize individual taxa and document their global distributions. We analyzed soils from 237 locations across six continents and found that only 2% of bacterial phylotypes (~500 phylotypes) consistently accounted for almost half of the soil bacterial communities worldwide. Despite the overwhelming diversity of bacterial communities, relatively few bacterial taxa are abundant in soils globally. We clustered these dominant taxa into ecological groups to build the first global atlas of soil bacterial taxa. Our study narrows down the immense number of bacterial taxa to a “most wanted” list that will be fruitful targets for genomic and cultivation-based efforts aimed at improving our understanding of soil microbes and their contributions to ecosystem functioning.
Article
Full-text available
The versatile soil bacterium Anaeromyxobacter dehalogenans lacks the hallmark denitrification genes nirS and nirK (encoding NO2⁻→NO reductases) and couples growth to NO3⁻ reduction to NH4⁺ (respiratory ammonification) and to N2O reduction to N2. A. dehalogenans also grows by reducing Fe(III) to Fe(II), which chemically reacts with NO2⁻ to form N2O (i.e., chemodenitrification). Following the addition of 100 μmol of NO3⁻ or NO2⁻ to Fe(III)-grown axenic cultures of A. dehalogenans, 54 (±7) μmol and 113 (±2) μmol N2O-N, respectively, were produced and subsequently consumed. The conversion of NO3⁻ to N2 in the presence of Fe(II) through linked biotic-abiotic reactions represents an unrecognized ecophysiology of A. dehalogenans. The new findings demonstrate that the assessment of gene content alone is insufficient to predict microbial denitrification potential and N loss (i.e., the formation of gaseous N products). A survey of complete bacterial genomes in the NCBI Reference Sequence database coupled with available physiological information revealed that organisms lacking nirS or nirK but with Fe(III) reduction potential and genes for NO3⁻ and N2O reduction are not rare, indicating that NO3⁻ reduction to N2 through linked biotic-abiotic reactions is not limited to A. dehalogenans. Considering the ubiquity of iron in soils and sediments and the broad distribution of dissimilatory Fe(III) and NO3⁻ reducers, denitrification independent of NO-forming NO2⁻ reductases (through combined biotic-abiotic reactions) may have substantial contributions to N loss and N2O flux. IMPORTANCE Current attempts to gauge N loss from soils rely on the quantitative measurement of nirK and nirS genes and/or transcripts. In the presence of iron, the common soil bacterium Anaeromyxobacter dehalogenans is capable of denitrification and the production of N2 without the key denitrification genes nirK and nirS. 
Such chemodenitrifiers denitrify through combined biotic and abiotic reactions and have potentially large contributions to N loss to the atmosphere and fill a heretofore unrecognized ecological niche in soil ecosystems. The findings emphasize that the comprehensive understanding of N flux and the accurate assessment of denitrification potential can be achieved only when integrated studies of interlinked biogeochemical cycles are performed.
Article
Full-text available
The dynamics of individual microbial populations and their gene functions in agricultural soils, especially after major activities such as nitrogen (N) fertilization, remain elusive but are important for a better understanding of nutrient cycling. Here, we analyzed 20 short-read metagenomes collected at four time points during 1 year from two depths (0 to 5 and 20 to 30 cm) in two Midwestern agricultural sites representing contrasting soil textures (sandy versus silty loam) with similar cropping histories. Although the microbial community taxonomic and functional compositions differed between the two locations and depths, they were more stable within a depth/site throughout the year than communities in natural aquatic ecosystems. For example, among the 69 population genomes assembled from the metagenomes, 75% showed a less than 2-fold change in abundance between any two sampling points. Interestingly, six deep-branching Thaumarchaeota and three complete ammonia oxidizer (comammox) Nitrospira populations increased up to 5-fold in abundance upon the addition of N fertilizer. These results indicated that indigenous archaeal ammonia oxidizers may respond faster (are more copiotrophic) to N fertilization than previously thought. None of 29 recovered putative denitrifier genomes encoded the complete denitrification pathway, suggesting that denitrification is carried out by a collection of different populations. Altogether, our study identified novel microbial populations and genes responding to seasonal and human-induced perturbations in agricultural soils that should facilitate future monitoring efforts and N-related studies. IMPORTANCE Even though the impact of agricultural management on the microbial community structure has been recognized, an understanding of the dynamics of individual microbial populations and what functions each population carries are limited. 
Yet, this information is important for a better understanding of nutrient cycling, with potentially important implications for preserving nitrogen in soils and sustainability. Here, we show that reconstructed metagenome-assembled genomes (MAGs) are relatively stable in their abundance and functional gene content year round, and seasonal nitrogen fertilization has selected for novel Thaumarchaeota and comammox Nitrospira nitrifiers that are potentially less oligotrophic than their marine counterparts previously studied.
Article
Full-text available
Progress in understanding changes in soil biology in response to latitude, elevation and disturbance gradients has generally lagged behind studies of above-ground plants and animals owing to methodological constraints and high diversity and complexity of interactions in below-ground food webs. New methods have opened research opportunities in below-ground systems, leading to a rapid increase in studies of below-ground organisms and processes. Here, we summarize results of forest soil biology research over the past 25 years in Puerto Rico as part of a 75th Anniversary Symposium on research of the USDA Forest Service International Institute of Tropical Forestry. These results are presented in the context of changes in soil and forest floor biota across latitudinal, elevation and disturbance gradients. Invertebrate detritivores in these tropical forests exerted a stronger influence on leaf decomposition than in cold temperate forests using a common substrate. Small changes in arthropods brought about using different litterbag mesh sizes induced larger changes in leaf litter mass loss and nutrient mineralization. Fungi and bacteria in litter and soil of wet forests were surprisingly sensitive to drying, leading to changes in nutrient cycling. Tropical fungi also showed sensitivity to environmental fluctuations and gradients as fungal phylotype composition in soil had a high turnover along an elevation gradient in Puerto Rico. Globally, tropical soil fungi had smaller geographic ranges than temperate fungi. Invertebrate activity accelerates decomposition of woody debris, especially in lowland dry forest, but invertebrates are also important in early stages of log decomposition in middle elevation wet forests. Large deposits of scolytine bark beetle frass from freshly fallen logs coincide with nutrient immobilization by soil microbial biomass and a relatively low density of tree roots in soil under newly fallen logs. Tree roots shifted their foraging locations seasonally in relation to decaying logs. Native earthworms were sensitive to disturbance and were absent from tree plantations, whereas introduced earthworms were found across elevation and disturbance gradients.
Article
Rare species are increasingly recognized as crucial, yet vulnerable components of Earth's ecosystems. This is also true for microbial communities, which are typically composed of a high number of relatively rare species. Recent studies have demonstrated that rare species can have an over-proportional role in biogeochemical cycles and may be a hidden driver of microbiome function. In this review, we provide an ecological overview of the rare microbial biosphere, including causes of rarity and the impacts of rare species on ecosystem functioning. We discuss how rare species can have a preponderant role for local biodiversity and species turnover with rarity potentially bound to phylogenetically conserved features. Rare microbes may therefore be overlooked keystone species regulating the functioning of host-associated, terrestrial and aquatic environments. We conclude this review with recommendations to guide scientists interested in investigating this rapidly emerging research area. The ISME Journal advance online publication, 10 January 2017; doi:10.1038/ismej.2016.174.
Chapter
Temperate and boreal forest ecosystems cover approximately 13% of the world terrestrial surface and provide a wide range of ecological services to society, including a significant contribution to the regulation of atmospheric greenhouse gas concentrations. Forests do not only function as major sinks (and sources) for atmospheric carbon dioxide (CO2) but also as significant sources and sinks of other atmospheric greenhouse gases, namely, nitrous oxide (N2O) and methane (CH4). The importance of forests as regulators of atmospheric concentrations of these trace gases is undebated, but how this function might change in view of the ongoing climate and associated environmental changes remains a matter of debate. On the one hand, increases in temperature and atmospheric CO2 could lead to permafrost thaw, dramatically increasing N transformation rates in the soil and associated N2O emissions. On the other hand, declining precipitation or changes toward more episodic rainfall events might result in the opposite, through reduced N2O efflux and stimulated uptake of atmospheric CH4 by forest soils. By providing a set of examples from field and laboratory studies, we present the current knowledge and the research perspectives aiming at a better understanding of the current and future role of boreal and temperate forest soils as regulators of the atmospheric concentrations of N2O and CH4 in the frame of global change.
Article
Soil microorganisms are clearly a key component of both natural and managed ecosystems. Despite the challenges of surviving in soil, a gram of soil can contain thousands of individual microbial taxa, including viruses and members of all three domains of life. Recent advances in marker gene, genomic and metagenomic analyses have greatly expanded our ability to characterize the soil microbiome and identify the factors that shape soil microbial communities across space and time. However, although most soil microorganisms remain undescribed, we can begin to categorize soil microorganisms on the basis of their ecological strategies. This is an approach that should prove fruitful for leveraging genomic information to predict the functional attributes of individual taxa. The field is now poised to identify how we can manipulate and manage the soil microbiome to increase soil fertility, improve crop production and improve our understanding of how terrestrial ecosystems will respond to environmental change.
Article
Microorganisms with the capacity to reduce the greenhouse gas nitrous oxide (N2O) to harmless dinitrogen gas are receiving increased attention due to increasing N2O emissions (and our need to mitigate climate change) and to recent discoveries of novel N2O-reducing bacteria and archaea. The diversity of denitrifying and nondenitrifying microorganisms with capacity for N2O reduction was recently shown to be greater than previously expected. A formerly overlooked group (clade II) in the environment include a large fraction of nondenitrifying N2O reducers, which could be N2O sinks without major contribution to N2O formation. We review the recent advances about fundamental understanding of the genomics, physiology, and ecology of N2O reducers and the importance of these findings for curbing N2O emissions.