Salvador Capella-Gutierrez

Barcelona Supercomputing Center · Department of Life Sciences

PhD in Bioinformatics

About

99
Publications
41,421
Reads
11,632
Citations
Citations since 2016
61 Research Items
10,006 Citations
Additional affiliations
January 2013 - February 2016
Centre for Genomic Regulation
Position
  • PostDoc Position

Publications (99)
Article
Single-cell omics (SCO) has revolutionized the way and the level of resolution by which life science research is conducted, not only impacting our understanding of fundamental cell biology but also providing novel solutions in cutting-edge medical research. The rapid development of single-cell technologies has been accompanied by the active develop...
Article
Full-text available
The Orthology Benchmark Service (https://orthology.benchmarkservice.org) is the gold standard for orthology inference evaluation, supported and maintained by the Quest for Orthologs consortium. It is an essential resource to compare existing and new methods of orthology inference (the bedrock for many comparative genomics and phylogenetic analysis)...
Preprint
Full-text available
Software plays a crucial and growing role in research. Unfortunately, the computational component in Life Sciences research is challenging to reproduce and verify most of the time. It may be undocumented or opaque, may contain unknown errors that affect the outcome, or may be unavailable altogether and therefore impossible for others to use. These issues are...
Article
Next Generation Sequencing technologies significantly impact the field of Antimicrobial Resistance (AMR) detection and monitoring, with immediate uses in diagnosis and risk assessment. For this application and in general, considerable challenges remain in demonstrating sufficient trust to act upon the meaningful information produced from raw data,...
Article
Full-text available
COVID-19 is an infectious disease caused by the SARS-CoV-2 virus, which has spread all over the world leading to a global pandemic. The fast progression of COVID-19 has been mainly related to the high contagion rate of the virus and the worldwide mobility of humans. In the absence of pharmacological therapies, governments from different countries h...
Article
Full-text available
PhylomeDB is a unique knowledge base providing public access to minable and browsable catalogues of pre-computed genome-wide collections of annotated sequences, alignments and phylogenies (i.e. phylomes) of homologous genes, as well as to their corresponding phylogeny-based orthology and paralogy relationships. In addition, PhylomeDB trees and alig...
Article
Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have return...
Preprint
Full-text available
COVID-19 is an infectious disease caused by the SARS-CoV-2 virus, which has spread all over the world leading to a global pandemic. The fast progression of COVID-19 has been mainly related to the high contagion rate of the virus and the worldwide mobility of humans. In the absence of pharmacological therapies, governments from different countries h...
Article
DOME is a set of community-wide recommendations for reporting supervised machine learning–based analyses applied to biological studies. Broad adoption of these recommendations will help improve machine learning assessment and reproducibility.
Article
Full-text available
Background: Many types of data from genomic analyses can be represented as genomic tracks, i.e. features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in lo...
Article
Full-text available
eTRANSAFE is a research project funded within the Innovative Medicines Initiative (IMI), which aims at developing integrated databases and computational tools (the eTRANSAFE ToxHub) that support the translational safety assessment of new drugs by using legacy data provided by the pharmaceutical companies that participate in the project. The project...
Preprint
Full-text available
Next Generation Sequencing technologies significantly impact the field of Antimicrobial Resistance (AMR) detection and monitoring, with immediate uses in diagnosis and risk assessment. For this application and in general, considerable challenges remain in demonstrating sufficient trust to act upon the meaningful information produced from raw data,...
Article
Full-text available
Rett syndrome (RTT) is a rare neurological disorder mostly caused by a genetic variation in MECP2. Making new MECP2 variants and the related phenotypes available provides data for better understanding of disease mechanisms and faster identification of variants for diagnosis. This is, however, currently hampered by the lack of interoperability betw...
Article
Full-text available
Next Generation Sequencing technologies significantly impact the field of Antimicrobial Resistance (AMR) detection and monitoring, with immediate uses in diagnosis and risk assessment. For this application and in general, considerable challenges remain in demonstrating sufficient trust to act upon the meaningful information produced from raw data,...
Article
Purpose: Recurrent and/or metastatic unresectable cutaneous squamous cell carcinomas (cSCCs) are treated with chemotherapy or radiotherapy but have poor clinical responses. A limited response (up to 45% of cases) to EGFR-targeted therapies was observed in clinical trials with advanced and metastatic cSCC patients. Here, we analyze the molecular tr...
Article
Full-text available
Copy number variations (CNVs) are major causative contributors to the genesis of both genetic diseases and human neoplasias. While “High-Throughput” sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim...
Article
Full-text available
Comorbidity is a medical condition attracting increasing attention in healthcare and biomedical research. Little is known about the involvement of potential molecular factors leading to the emergence of a specific disease in patients affected by other conditions. We present here a disease interaction network inferred from similarities between patie...
Article
Full-text available
The identification of orthologs (genes in different species which descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficu...
Article
Full-text available
Bioinformaticians and biologists rely increasingly upon workflows for the flexible utilization of the many life science tools that are needed to optimally convert data into knowledge. We outline a pan-European enterprise to provide a catalogue (https://bio.tools) of tools and databases that can be used in these workflows. bio.tools not onl...
Article
Full-text available
Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) are now recognised as major determinants in cellular regulation. This white paper presents a roadmap for future e-infrastructure developments in the field of IDP research within the ELIXIR framework. The goal of these developments is to drive the creation of high-q...
Article
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Article
Full-text available
Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition...
Preprint
Full-text available
Background: Transcriptomics data, often referred to as RNA-Seq, are increasingly being adopted in clinical practice due to the opportunity to answer several questions with the same data (e.g. gene expression, splicing, allele-specific expression), even without matching DNA. Indeed, recent studies showed how RNA-Seq can contribute to deciphering the impact...
Poster
Full-text available
The Spanish National Bioinformatics Institute (INB) is the ELIXIR Node in Spain (ELIXIR-ES). The INB was founded in 2003 as a distributed network of nodes with a central coordination hub. Since its renewal process (2018 - 2020), the INB/ELIXIR-ES has increased the participant nodes to 19 research groups distributed across 13 institutions in Spain,...
Preprint
Full-text available
Comorbidity is an impactful medical problem that is attracting increasing attention in healthcare and biomedical research. However, little is known about the molecular processes leading to the development of a specific disease in patients affected by other conditions. We present a disease interaction network inferred from similarities in patients'...
Article
Full-text available
As a part of the ELIXIR-EXCELERATE efforts in capacity building, we present here 10 steps to facilitate researchers getting started in genome assembly and genome annotation. The guidelines given are broadly applicable, intended to be stable over time, and cover all aspects from start to finish of a general assembly and annotation project. Intrinsic...
Chapter
Phaseolus vulgaris is the most important legume species for human nourishment. However, until very recently genomics resources for this plant have been scarce, which has prevented a full understanding of the parallel domestications that occurred in two geographical regions: Mesoamerica and the Andes. The first reference genome for P. vulgaris, the Andean landrace...
Article
Full-text available
The dependence of life scientists on software has steadily grown in recent years. For many tasks, researchers have to decide which of the available bioinformatics software is most suitable for their specific needs. Additionally, researchers should be able to objectively select the software that provides the highest accuracy, the best efficiency and...
Preprint
Full-text available
The dependence of life scientists on software has steadily grown in recent years. For many tasks, researchers have to decide which of the available bioinformatics software is most suitable for their specific needs. Additionally, researchers should be able to objectively select the software that provides the highest accuracy, the best efficiency and...
Article
Full-text available
The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have esta...
Article
Full-text available
Considerable effort has been devoted to systematically retrieving information on genes and proteins as well as the relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologi...
Article
Full-text available
Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software developme...
Article
Full-text available
Considerable effort has been devoted to systematically retrieving information on genes and proteins as well as the relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologi...
Article
Full-text available
Background: Modern civilization depends on only a few plant species for its nourishment. These crops were derived via several thousands of years of human selection that transformed wild ancestors into high-yielding domesticated descendants. Among cultivated plants, common bean (Phaseolus vulgaris L.) is the most important grain legume. Yet, our unde...
Article
Full-text available
Background: Genomic studies of endangered species provide insights into their evolution and demographic history, reveal patterns of genomic erosion that might limit their viability, and offer tools for their effective conservation. The Iberian lynx (Lynx pardinus) is the most endangered felid and a unique example of a species on the brink...
Article
Cell-specific regulation of protein levels and activity is essential for the distribution of functions among multiple cell types in animals. The finding that many genes involved in these regulatory processes have a premetazoan origin raises the intriguing possibility that the mechanisms required for spatially regulated cell differentiation evolved...
Article
Full-text available
Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess t...
Article
Full-text available
Background: Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. Results: We report the genome and the transcription atlas of coding...
Conference Paper
The genus Fusarium includes more than 200 species, of which 74 have been isolated from human infections, causing a broad spectrum of opportunistic infections. The most pathogenic and multi-drug resistant among them is the Fusarium solani species complex (FSSC). Superficial infections, such as keratitis and onychomycosis, are frequently manifested in im...
Article
Full-text available
Selenoproteins are proteins that incorporate selenocysteine (Sec), a non-standard amino acid that is encoded by UGA, normally a stop codon. The synthesis of Sec requires the enzyme Selenophosphate synthetase (SPS or SelD), conserved in all prokaryotic and eukaryotic genomes encoding selenoproteins. Here we study the evolutionary history of SPS gene...
Article
Full-text available
Here, we report the draft genome sequence of Solanum commersonii, which consists of ∼830 megabases with an N50 of 44,303 bp anchored to 12 chromosomes, using the potato (Solanum tuberosum) genome sequence as a reference. Compared with potato, S. commersonii shows a striking reduction in heterozygosity (1.5% versus 53 to 59%), and differences in gen...
Article
Full-text available
Here, we report the draft genome sequence of Solanum commersonii, which consists of ∼830 megabases with an N50 of 44,303 bp anchored to 12 chromosomes, using the potato (Solanum tuberosum) genome sequence as a reference. Compared with potato, S. commersonii shows a striking reduction in heterozygosity (1.5% versus 53 to 59%), and differences in gen...
Preprint
Full-text available
SPS catalyzes the synthesis of selenophosphate, the selenium donor for the synthesis of the amino acid selenocysteine (Sec), incorporated in selenoproteins in response to the UGA codon. SPS is unique among proteins of the selenoprotein biosynthesis machinery in that it is, in many species, a selenoprotein itself, although, as in all selenoproteins,...
Article
Full-text available
To provide context for the diversification of archosaurs (the group that includes crocodilians, dinosaurs, and birds), we generated draft genomes of three crocodilians: Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rat...
Article
Full-text available
To better determine the history of modern birds, we performed a genome-scale phylogenetic analysis of 48 species representing all orders of Neoaves using phylogenomic methods created to handle genome-scale data. We recovered a highly resolved tree that confirms previously controversial sister or close relationships. We identified the first divergen...