Nikos C Kyrpides’s research while affiliated with Lawrence Berkeley National Laboratory and other places
The Genomes OnLine Database (GOLD; https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute is a comprehensive online metadata repository designed to catalog and manage information related to (meta)genomic sequence projects. GOLD provides a centralized platform where researchers can access a wide array of metadata from its four organization levels, namely Study, Organism/Biosample, Sequencing Project and Analysis Project. GOLD continues to serve as a valuable resource and has seen significant growth and expansion since its inception in 1997. With its expanded role as a collaborative platform, it not only actively imports data from other primary repositories such as the National Center for Biotechnology Information but also supports contributions from researchers worldwide. This collaborative approach has enriched the database with diverse datasets, creating a more integrated resource to enhance scientific insights. As genomic research becomes increasingly integral to various scientific disciplines, more researchers and institutions are turning to GOLD for their metadata needs. To meet this growing demand, GOLD has expanded by adding diverse metadata fields, intuitive features, advanced search capabilities and enhanced data visualization tools, making it easier for users to find and interpret relevant information. This manuscript provides an update and highlights the new features introduced over the last 2 years.
The North Temperate Lakes Long-Term Ecological Research (NTL-LTER) program has been extensively used to improve understanding of how aquatic ecosystems respond to environmental stressors, climate fluctuations, and human activities. Here, we report on the metagenomes of samples collected between 2000 and 2019 from Lake Mendota, a freshwater eutrophic lake within the NTL-LTER site. We utilized the distributed metagenome assembler MetaHipMer to coassemble over 10 terabases (Tbp) of data from 471 individual Illumina-sequenced metagenomes. A total of 95,523,664 contigs were assembled and binned to generate 1,894 non-redundant metagenome-assembled genomes (MAGs) with ≥50% completeness and ≤10% contamination. Phylogenomic analysis revealed that the MAGs were nearly exclusively bacterial, dominated by Pseudomonadota (Proteobacteria, N = 623) and Bacteroidota (N = 321). Nine eukaryotic MAGs were identified by eukCC with six assigned to the phylum Chlorophyta. Additionally, 6,350 high-quality viral sequences were identified by geNomad with the majority classified in the phylum Uroviricota. This expansive coassembled metagenomic dataset provides an unprecedented foundation to advance understanding of microbial communities in freshwater ecosystems and explore temporal ecosystem dynamics.
Historically neglected by microbial ecologists, soil viruses are now thought to be critical to global biogeochemical cycles. However, our understanding of their global distribution, activities and interactions with the soil microbiome remains limited. Here we present the Global Soil Virus Atlas, a comprehensive dataset compiled from 2,953 previously sequenced soil metagenomes and composed of 616,935 uncultivated viral genomes and 38,508 unique viral operational taxonomic units. Rarefaction curves from the Global Soil Virus Atlas indicate that most soil viral diversity remains unexplored, further underscored by high spatial turnover and low rates of shared viral operational taxonomic units across samples. By examining genes associated with biogeochemical functions, we also demonstrate the viral potential to impact soil carbon and nutrient cycling. This study represents an extensive characterization of soil viral diversity and provides a foundation for developing testable hypotheses regarding the role of the virosphere in the soil microbiome and global biogeochemistry.
The fields of metagenomics and metatranscriptomics involve the examination of complete nucleotide sequences, gene identification, and analysis of potential biological functions within diverse organisms or environmental samples. Despite the vast opportunities for discovery in metagenomics, the sheer volume and complexity of sequence data often present challenges in processing, analysis, and visualization. This article highlights the critical role of advanced visualization tools in enabling effective exploration, querying, and analysis of these complex datasets. Emphasizing the importance of accessibility, the article categorizes various visualizers based on their intended applications and highlights their utility in empowering bioinformaticians and non-bioinformaticians to interpret and derive insights from meta-omics data effectively.
Comprehensive microbial surveillance was conducted at NASA’s Mars 2020 spacecraft assembly facility (SAF), where whole-genome sequencing (WGS) of 110 bacterial strains was performed. One isolate, designated 179-BFC-A-HSᵀ, exhibited less than 80% average nucleotide identity (ANI) to known species, suggesting a novel organism. This strain demonstrated high-level resistance [minimum inhibitory concentration (MIC) >256 mg/L] to third-generation cephalosporins, including ceftazidime, cefpodoxime, combination ceftazidime/avibactam, and the fourth-generation cephalosporin cefepime. The results of a comparative genomic analysis revealed that 179-BFC-A-HSᵀ is most closely related to Virgibacillus halophilus 5B73Cᵀ, sharing an ANI of 78.7% and a digital DNA-DNA hybridization (dDDH) value of 23.5%, while their 16S rRNA gene sequences shared 97.7% nucleotide identity. Based on these results and the recent recognition that the genus Virgibacillus is polyphyletic, strain 179-BFC-A-HSᵀ is proposed as a novel species of a novel genus, Tigheibacillus jepli gen. nov., sp. nov. (type strain 179-BFC-A-HSᵀ = DSM 115946ᵀ = NRRL B-65666ᵀ), and its closest neighbor, V. halophilus, is proposed to be reassigned to this genus as Tigheibacillus halophilus comb. nov. (type strain 5B73Cᵀ = DSM 21623ᵀ = JCM 21758ᵀ = KCTC 13935ᵀ). It was also necessary to reclassify its second closest neighbor, Virgibacillus soli, as a member of a novel genus, Paracerasibacillus, reflecting its phylogenetic position relative to the genus Cerasibacillus, for which we propose Paracerasibacillus soli comb. nov. (type strain CC-YMP-6ᵀ = DSM 22952ᵀ = CCM 7714ᵀ). Within Amphibacillaceae (n = 64), P. soli exhibited 11 antibiotic resistance genes (ARG), while T. jepli encoded 3, lacking any known β-lactamases, suggesting resistance from variant penicillin-binding proteins, disrupting cephalosporin efficacy. P. soli was highly resistant to azithromycin (MIC >64 mg/L) yet susceptible to cephalosporins and penicillins.
IMPORTANCE
The significance of this research extends to understanding microbial survival and adaptation in oligotrophic environments, such as those found in SAF. Whole-genome sequencing of several strains isolated from Mars 2020 mission assembly cleanroom facilities, including the discovery of the novel species Tigheibacillus jepli, highlights the resilience of microbes in nutrient-scarce settings and their antimicrobial resistance (AMR) to clinically relevant antibiotic classes. The study also redefines the taxonomic classifications within the Amphibacillaceae family, aligning genetic identities with phylogenetic data. Investigating ARG and virulence factors (VF) across these strains illuminates the microbial capability for resistance under resource-limited conditions while emphasizing the role of human-associated VF in microbial survival, informing sterilization practices and microbial management in similar oligotrophic settings beyond spacecraft assembly cleanrooms, such as pharmaceutical and medical industry cleanrooms.
This study reports the complete genome sequence of Sphaerochaeta associata GLS2ᵀ (=VKM B-2742ᵀ =DSM 26261ᵀ), which was isolated from a consortium with the methanogenic archaeon Methanosarcina mazei JL01. The consortium was collected from permafrost of the Kolyma lowland in Russia. A hybrid approach, combining paired-end Illumina reads with Oxford Nanopore Technologies MinION reads, was used to assemble the genome. The final assembly resulted in a circular chromosome that is 3,554,903 bp long. This high-quality genome assembly serves as a basis for algorithmic pathway reconstruction and postgenomic analysis. To further this research, the genome was imported into research portals for the algorithmic reconstruction of metabolic pathways, both in general (KEGG) and with special attention to carbohydrate metabolism (CAZy). These portals offer high-quality workspaces for in-depth studies.
We present six whole community shotgun metagenomic sequencing data sets of two types of biological soil crusts sampled at the ecotone of the Mojave Desert and Colorado Desert in California. These data will help us understand the diversity and function of biocrust microbial communities, which are essential for desert ecosystems.
We present eight metatranscriptomic datasets of light algal and cyanolichen biological soil crusts from the Mojave Desert in response to wetting. These data will help us understand gene expression patterns in desert biocrust microbial communities after they have been reactivated by the addition of water.
In this study, a Gram-stain-positive, non-motile, oxidase- and catalase-negative, rod-shaped bacterial strain (SG_E_30_P1ᵀ) that formed light yellow colonies was isolated from a groundwater sample of Sztaravoda spring, Hungary. Based on 16S rRNA phylogenetic and phylogenomic analyses, the strain was found to form a distinct lineage within the family Microbacteriaceae. Its closest relatives in terms of near full-length 16S rRNA gene sequences are Salinibacterium hongtaonis MH299814 (97.72 % sequence similarity) and Leifsonia psychrotolerans GQ406810 (97.57 %). The novel strain grows optimally at 20–28 °C, at neutral pH and in the presence of NaCl (1–2 % w/v). Strain SG_E_30_P1ᵀ contains MK-7 and B-type peptidoglycan with diaminobutyrate as the diagnostic amino acid. The major cellular fatty acids are anteiso-C15:0, iso-C16:0 and iso-C14:0, and the polar lipid profile is composed of diphosphatidylglycerol and phosphatidylglycerol, as well as an unidentified aminoglycolipid, aminophospholipid and some unidentified phospholipids. The assembled draft genome is a contig with a total length of 2 897 968 bp and a DNA G+C content of 65.5 mol%. Amino acid identity values of <62.54 % with its closest relatives with sequenced genomes, as well as other genome distance results, indicate that this bacterium represents a novel genus within the family Microbacteriaceae. We suggest that SG_E_30_P1ᵀ (=DSM 111415ᵀ =NCAIM B.02656ᵀ) represents the type strain of a novel genus and species for which the name Antiquaquibacter oligotrophicus gen. nov., sp. nov. is proposed.
Citations (73)
... Recent total metagenome-based studies have advanced our understanding of soil viruses, revealing the vast diversity of viral communities and their functional potentials across various soil ecosystems (Graham et al., 2024; Ma et al., 2024). Although viral size fraction metagenomes (viromes) have demonstrated effectiveness over total metagenomes in exploring the virosphere, their application has been limited to small-scale studies (Santos-Medellin et al., 2021). ...
... Several studies revealed habitat-specific differences in MGE distribution. For example, soil encodes many more plasmid taxonomic units than humans [8]. Similarly, phages show habitat specificity, with some restricted to particular environments [9]. ...
... By bypassing traditional methods, it has uncovered the vast majority of unculturable microorganisms and revealed new metabolic pathways and bioactive compounds [14], as well as the microbial "dark matter" present in diverse environments across the globe [15][16][17] (Figure 1). Pavlopoulos et al. [18] revealed an immense global diversity of previously uncharacterized proteins in global metagenomes by generating reference-free protein families, identifying over 106,000 novel protein clusters with no similarity to known sequences, thereby doubling the number of known protein families and highlighting vastly untapped functional and structural diversity within microbial "dark matter". Yan et al. [19] established a comprehensive global rumen virome database (RVD), identifying 397,180 viral operational taxonomic units (vOTUs) from 975 rumen metagenomes, revealing the previously unexplored viral "dark matter" of the rumen. ...
... In our catalog, only a small fraction are homologous to reference small protein datasets, with the vast majority of the novel small proteins being found in non-human-associated habitats (Supplementary Fig. 5b). On the other hand, it encompasses most of the known small proteins in either the RefSeq database or in families discovered recently (NMPfamsDB 61 and FesNov families 28). When comparing with small protein databases that focus on eukaryotic organisms, such as smProt2 62, OpenProt2.0 ...
... Latescibacterota and Desulfobacterota are commonly found in low abundance in soils and, notably, both groups happen to harbor bacteria with anaerobic metabolism: anaerobic fermentation in Latescibacterota (Youssef et al., 2015) and Fe(III)-reduction in Geobacteraceae (Megonigal et al., 2003), the most represented family in our study. Crop soils, on the other hand, had a higher relative abundance of Deinococcota (mostly Deinococcaceae), which were more abundant in biocrusts (Wang et al., 2024) and are characterized by their high resistance to UV radiation (Seshadri et al., 2023), consistent with the lower vegetation cover and higher exposure of crop soils to direct sunlight. Prairie restoration also increased the relative abundance of the fungal phylum Glomeromycota [arbuscular mycorrhizal (AM) fungi] (Figure 4), in agreement with AM fungal spore density measurements in these sites and PLFA abundance in other tallgrass prairie restoration studies (Allison et al., 2005; Baer et al., 2015). ...
... CheckV (Table S4) was used for quality control assessment of the phage genomes [32]. Each phage genome was annotated using MetaCerberus (v1.4) [33] using all databases option with Pyrodigal-gv [34] [35]. ...
... Many studies on the microbiome of diverse environments and hosts are being published, describing how microbial composition varies under different conditions, especially those that cause stress to the hosts [2, 29, 30]. The amount of information on microbial diversity obtained from metagenomic studies is so large that it would be impossible to analyze without bioinformatics tools [31]. ...
... We used geNomad (70) with default parameters to classify contigs as plasmids and viruses. Prophages are noted in geNomad output, and these were excluded from the viral contig counts. ...
... Tools such as Kraken2 and Centrifuge, recognized for processing longer reads, provide enhanced capabilities for microbial sequence analysis. The final stage of metatranscriptomic analysis involves coding transcript annotation by tools like InterProScan, BLAST, RefSeq, Diamond, and HMMER (HMM-based), which were further refined by KEGG, COG, and tools like antiSMASH for detailed annotation (Baltoumas et al. 2023). The set of tools collectively advances the understanding of microbial functional dynamics in diverse environments. ...
... Similarity search of nodes on information networks serves as the foundation for numerous data analytics techniques [1][2][3][4][5] and has wide-ranging applications, including online advertising [6], recommendation systems [7], biomedical analysis [8][9][10], spatial-temporal systems [11,12]. Most real-world information networks are heterogeneous information networks (HINs) [13,14], characterized by the coexistence of edges connecting nodes of various types showcases objects and relations within an academic network, with "PAP" (Paper-Author-Paper) and "PVP" (Paper-Venue-Paper) as two example metapaths tailored to specific querying needs, are not predefined but are instead designed to capture user search intentions more precisely, thereby meeting user expectations more effectively. ...