Jaime Huerta-Cepas

European Molecular Biology Laboratory | EMBL · Structural and Computational Biology Unit (Heidelberg)

PhD

About

117
Publications
44,640
Reads
31,388
Citations
Citations since 2016
57 Research Items
29,009 Citations
[Chart: citations by year, 2016 to 2022; vertical axis 0 to 6,000]
Additional affiliations
January 2014 - March 2016
European Molecular Biology Laboratory
Position
  • PostDoc Position
January 2009 - December 2013
University Pompeu Fabra
Position
  • PostDoc Position
January 2009 - December 2013
Centre for Genomic Regulation
Position
  • PostDoc Position

Publications (117)
Article
Full-text available
The interpretation of genomic, transcriptomic and other microbial ‘omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of...
Article
Full-text available
The eggNOG (evolutionary gene genealogy Non-supervised Orthologous Groups) database is a bioinformatics resource providing orthology data and comprehensive functional information for organisms from all domains of life. Here, we present a major update of the database and website (version 6.0), which increases the number of covered organisms to 12 53...
Article
Full-text available
Despite advances in sequencing, lack of standardization makes comparisons across studies challenging and hampers insights into the structure and function of microbial communities across multiple habitats on a planetary scale. Here we present a multi-omics analysis of a diverse set of 880 microbial community samples collected for the Earth Microbiom...
Article
Full-text available
Synteny conservation analysis is a well-established methodology to investigate the potential functional role of unknown prokaryotic genes. However, bioinformatic tools to reconstruct and visualise genomic contexts usually depend on slow computations, are restricted to narrow taxonomic ranges, and/or do not allow for the functional and interactive e...
Article
Full-text available
Phylogenomics data have grown exponentially over the last decades. It is currently common for genome-wide projects to generate hundreds or even thousands of phylogenetic trees and multiple sequence alignments, which may also be very large in size. However, the analysis and interpretation of such data still depends on custom bioinformatic and visual...
Preprint
Full-text available
Microbes are the most abundant form of life on Earth and play crucial roles in carbon and nutrient cycling. Despite their crucial role, our understanding of microbial diversity and physiology on the ocean floor is limited. To address this gap in knowledge, we obtained 55 novel bacterial metagenome-assembled genomes (MAGs) from coastal and deep sea...
Preprint
Full-text available
Most microbes on our planet remain uncultured and poorly studied. Recent efforts to catalog their genetic diversity have revealed that a significant fraction of the observed microbial genes are functionally and evolutionarily untraceable, lacking homologs in reference databases. Despite their potential biological value, these apparently unrelated orpha...
Article
Full-text available
Microbial genes encode the majority of the functional repertoire of life on earth. However, despite increasing efforts in metagenomic sequencing of various habitats [1–3], little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of...
Article
Sponges and evolutionary origins: Sponges represent our distant animal relatives. They do not have a nervous system but do have a simple body for filter feeding. Surveying the cell types in the freshwater sponge Spongilla lacustris, Musser et al. found that many genes important in synaptic communication are expressed in cells of the small digestiv...
Article
Full-text available
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improve...
Article
Full-text available
Considering the enormous variety of LBDs at sensor proteins, an important question resides in establishing the forces that have driven their evolution and selection. We present here the first clear demonstration that environmental factors play an important role in the selection and evolution of LBDs.
Preprint
Full-text available
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improve...
Preprint
Full-text available
Post-transcriptional regulation is essential for life, yet we are currently unable to investigate its role in complex microbiome samples. Here we discover that co-translational mRNA degradation, where the degradation machinery follows the last translating ribosome, is conserved across prokaryotes. By investigating 5′P mRNA decay intermediates, we o...
Article
Full-text available
The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficu...
Article
Full-text available
Microbial organisms inhabit virtually all environments and encompass a vast biological diversity. The pangenome concept aims to facilitate an understanding of diversity within defined phylogenetic groups. Hence, pangenomes are increasingly used to characterize the strain diversity of prokaryotic species. To understand the interdependence of pangeno...
Article
Full-text available
Ocean microbial communities strongly influence the biogeochemistry, food webs, and climate of our planet. Despite recent advances in understanding their taxonomic and genomic compositions, little is known about how their transcriptomes vary globally. Here, we present a dataset of 187 metatranscriptomes and 370 metagenomes from 126 globally distribu...
Article
Full-text available
Microbiology depends on the availability of annotated microbial genomes for many applications. Comparative genomics approaches have been a major advance, but consistent and accurate annotations of genomes can be hard to obtain. In addition, newer concepts such as the pan-genome concept are still being implemented to help answer biological questions...
Preprint
Full-text available
The evolutionary origin of metazoan cell types such as neurons, muscles, digestive, and immune cells, remains unsolved. Using whole-body single-cell RNA sequencing in a sponge, an animal without nervous system and musculature, we identify 18 distinct cell types comprising four major families. This includes nitric-oxide sensitive contractile cells,...
Preprint
Full-text available
Microbial organisms inhabit virtually all environments and encompass a vast biological diversity. The pan-genome concept aims to facilitate an understanding of diversity within defined phylogenetic groups. Hence, pan-genomes are increasingly used to characterize the strain diversity of prokaryotic species. To understand the interdependency of pan-g...
Article
Full-text available
Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs) and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree...
Article
Full-text available
Background: Shotgun metagenomes contain a sample of all the genomic material in an environment, allowing for the characterization of a microbial community. In order to understand these communities, bioinformatics methods are crucial. A common first step in processing metagenomes is to compute abundance estimates of different taxonomic or functiona...
Article
Full-text available
Metagenomic sequencing has greatly improved our ability to profile the composition of environmental and host-associated microbial communities. However, the dependency of most methods on reference genomes, which are currently unavailable for a substantial fraction of microbial species, introduces estimation biases. We present an updated and function...
Article
Full-text available
Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING da...
Article
Full-text available
eggNOG is a public database of orthology relationships, gene evolutionary histories and functional annotations. Here, we present version 5.0, featuring a major update of the underlying genome sets, which have been expanded to 4445 representative bacteria and 168 archaea derived from 25 038 genomes, as well as 477 eukaryotic organisms and 2502 viral...
Article
Full-text available
Soils harbour some of the most diverse microbiomes on Earth and are essential for both nutrient cycling and carbon storage. To understand soil functioning, it is necessary to model the global distribution patterns and functional gene repertoires of soil microorganisms, as well as the biotic and environmental associations between the diversity and s...
Preprint
Full-text available
NGLess is a domain-specific language for describing next-generation sequence processing pipelines. It was developed with the goal of enabling user-friendly computational reproducibility. Using this framework, we developed NG-meta-profiler, a fast profiler for metagenomes which performs sequence preprocessing, mapping to bundled databases, filterin...
Article
Full-text available
Population genomics of prokaryotes has been studied in depth in only a small number of primarily pathogenic bacteria, as genome sequences of isolates of diverse origin are lacking for most species. Here, we conducted a large-scale survey of population structure in prevalent human gut microbial species, sampled from their natural environment, with a...
Article
Full-text available
The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have esta...
Article
Full-text available
Orthology assignment is ideally suited for functional inference. However, because predicting orthology is computationally intensive at large scale, and most pipelines are relatively inaccessible (e.g. new assignments only available through database updates), less precise homology-based functional transfer is still the default for (meta-)genome anno...
Article
Full-text available
Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Most empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretation...
Data
Supplementary material. Supplementary table 1. Provided separately as a Microsoft Excel file. Table caption: C-C bond forming and cleaving reactions, anchor reactions (having reactants with defined atomic formulas and more than a single carbon in both cleavage products) extracted from KEGG database (rel. 78, Apr 1st, 2016) (Kanehisa and Goto, 2000, K...
Article
Full-text available
Microbial cell factories based on renewable carbon sources are fundamental to a sustainable bio-economy. The economic feasibility of producer cells requires robust performance balancing growth and production. However, the inherent competition between these two objectives often leads to instability and reduces productivity. While algorithms exist to...
Preprint
Full-text available
Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Virtually all empirical evolutionary data studies contain a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect inter...
Article
Full-text available
The availability of microbial genomes has opened many new avenues of research within microbiology. This has been driven primarily by comparative genomics approaches, which rely on accurate and consistent characterization of genomic sequences. It is nevertheless difficult to obtain consistent taxonomic and integrated functional annotations for defin...
Preprint
Full-text available
Orthology assignment is ideally suited for functional inference. However, because predicting orthology is computationally intensive at large scale, and most pipelines are relatively inaccessible, less precise homology-based functional transfer is still the default for (meta-)genome annotation. We therefore developed eggNOG-mapper, a tool for functiona...
Article
Full-text available
Arthropods interact with humans at different levels, with highly beneficial roles (e.g. as pollinators) as well as negative impacts, for example as vectors of human or animal diseases or as agricultural pests. Several arthropod genomes are available at present and many others will be sequenced in the near future in the context of the i5K init...
Article
Persistence of fecal transplants: Fecal microbiota transplantation is a successful way of treating the distressing symptoms of irritable bowel disease or Clostridium difficile infection. The procedure is done by administering a concentrate of colonic bacteria from a healthy donor. Li et al. used metagenomic data to look at single-nucleotide variants...
Article
Full-text available
MOCAT2 is a software pipeline for metagenomic sequence assembly and gene prediction with novel features for taxonomic and functional abundance profiling. The automated generation and efficient annotation of non-redundant reference catalogs by propagating pre-computed assignments from 18 databases covering various functional categories allows for fa...
Article
Full-text available
Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess t...
Article
Full-text available
The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common...
Article
Full-text available
eggNOG is a public resource that provides Orthologous Groups (OGs) of proteins at different taxonomic levels, each with integrated and summarized functional annotations. Developments since the latest public release include changes to the algorithm for creating OGs across taxonomic levels, making nested groups hierarchically consistent. This allows...
Article
Full-text available
Quest for Orthologs (QfO) is a community effort with the goal to improve and benchmark orthology predictions. As quality assessment assumes prior knowledge on species phylogenies, we investigated the congruency between existing species trees by comparing the relationships of 147 QfO reference organisms from six Tree of Life (ToL) / species tree pro...
Article
Full-text available
Horizontal gene transfer has emerged as a crucial driving force for the evolution of eukaryotes. This also includes Plasmodium falciparum and related economically and clinically relevant apicomplexan parasites, whose rather small genomes have been shaped not only by natural selection in different host populations but also by horizontal gene transfe...