David Gomez-Cabrero
Navarrabiomed · Translational Bioinformatics

PhD

About

244
Publications
39,896
Reads
7,712
Citations
Additional affiliations
May 2017 - present
Navarrabiomed
Position
  • Head of Department
September 2016 - present
King's College London
Position
  • Professor (Associate)
September 2014 - August 2018
Karolinska Institutet
Position
  • Professor (Assistant)

Publications (244)
Preprint
Full-text available
Patient heterogeneity represents a significant challenge for both individual patient management and clinical trial design, especially in the context of complex diseases. Most existing clinical classifications are based on scores built to predict the outcomes of the patients. These classical methods may thus miss features that contribute to heteroge...
Article
Full-text available
The rise of single-cell genomics is an attractive opportunity for data-hungry machine learning algorithms. The scBERT method, inspired by the success of BERT (‘bidirectional encoder representations from transformers’) in natural language processing, was recently introduced by Yang et al. as a data-driven tool to annotate cell types in single-cell g...
Preprint
Full-text available
With the emergence of single-cell foundation models, an important question arises: how do these models perform when trained on datasets having an imbalance in cell type distribution due to rare cell types or biased sampling? We benchmark three foundation models, scGPT, scBERT, and Geneformer, using skewed single-cell cell-type distribution for cell...
Article
There is a need for tools that integrate single‐cell multi‐omic data while addressing several integrative challenges simultaneously. To this end, we designed a deep‐learning based tool LIBRA that performs competitively in both “integration” and “prediction” tasks based on single‐cell multi‐omics data. Furthermore, when assessing the predictive powe...
Article
Full-text available
Background: Early detection has proven to be the most effective strategy to reduce the incidence and mortality of colorectal cancer (CRC). Nevertheless, most current screening programs suffer from low participation rates. A blood test may improve both the adherence to screening and the selection for colonoscopy. In this study, we conducted a serum-ba...
Preprint
Full-text available
Background: Human iPSCs' derivation and use in clinical studies are transforming medicine. Yet, there is a high cost and long waiting time for autologous iPS-based cellular therapy, and the genetic engineering of hypo-immunogenic iPS cell lines is hampered by numerous hurdles. Therefore, it is increasingly interesting to create cell stocks based...
Preprint
Full-text available
The early stages of the B-cell system are key for cellular immunity development, and alterations may lead to various disorders. Understanding the gene regulatory network (GRN) of this system is essential for studying healthy development and malignant transformations. To this end, we generated matched human data for chromatin accessibility and trans...
Preprint
Full-text available
In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI). Specifically, we investigate different setups to incorporate trainable features into a multi-layer encoder-decoder GSM formulation under frozen pre-trained settings. Our architecture inclu...
Preprint
Full-text available
Discovering non-linear dynamical models from data is at the core of science. Recent progress hinges upon sparse regression of observables using extensive libraries of candidate functions. However, it remains challenging to model hidden non-observable control variables governing switching between different dynamical regimes. Here we develop a data-e...
Article
Full-text available
The historical lack of preclinical models reflecting the genetic heterogeneity of multiple myeloma (MM) hampers the advance of therapeutic discoveries. To circumvent this limitation, we screened mice engineered to carry eight MM lesions (NF-κB, KRAS, MYC, TP53, BCL2, cyclin D1, MMSET/NSD2 and c-MAF) combinatorially activated in B lymphocytes follow...
Article
The diversity of microbial insertion sequences, crucial mobile genetic elements in generating diversity in microbial genomes, needs to be better represented in current microbial databases. Identification of these sequences in microbiome communities presents some significant problems that have led to their underrepresentation. Here, we present a bio...
Article
Full-text available
Recent progress in Single-Cell Genomics has produced different library protocols and techniques for molecular profiling. We formulate a unifying, data-driven, integrative, and predictive methodology for different libraries, samples, and paired-unpaired data modalities. Our design of scAEGAN includes an autoencoder (AE) network integrated with adver...
Article
Full-text available
Coronary Artery Fistulae (CAFs) are cardiac congenital anomalies consisting of an abnormal communication of a coronary artery with either a cardiac chamber or another cardiac vessel. In humans, these congenital anomalies can lead to complications such as myocardial hypertrophy, endocarditis, heart dilatation, and failure. Unfortunately, despite the...
Article
Full-text available
Early hematopoiesis is a continuous process in which hematopoietic stem and progenitor cells (HSPCs) gradually differentiate toward specific lineages. Aging and myeloid malignant transformation are characterized by changes in the composition and regulation of HSPCs. In this study, we used single-cell RNA sequencing (scRNA-seq) to characterize an en...
Article
Our laboratory has demonstrated that the NLRP3 inflammasome has a critical part in the microglial innate immune response to Alzheimer’s disease (AD)‐related peptides, triggering the production of cleaved‐caspase‐1 and IL‐1β. NLRP3 activation was found in post‐mortem tissue from individuals with AD (Heneka et al., 2013) and in transgenic mouse model...
Article
Full-text available
Circulating tumor cells (CTCs) are the key link between a primary tumor and distant metastases, but once in the bloodstream, loss of adhesion induces cell death. To identify mechanisms relevant for melanoma CTC survival, we performed RNAseq and discovered that detached melanoma cells and isolated melanoma CTCs rewire lipid metabolism by up-regulating...
Article
Full-text available
Background: Multiple sclerosis (MS) is a chronic inflammatory neurodegenerative disease of the central nervous system (CNS) characterized by irreversible disability at later progressive stages. A growing body of evidence suggests that disease progression depends on age and inflammation within the CNS. We aimed to investigate epigenetic aging in bulk...
Article
Full-text available
Profiling of mRNA expression is an important method to identify biomarkers but is complicated by limited correlations between mRNA expression and protein abundance. We hypothesised that these correlations could be improved by mathematical models based on measuring splice variants and time delay in protein translation. We characterised time-series of p...
Article
Full-text available
The Mediterranean diet (MedDiet) represents the traditional food consumption patterns of people living in countries bordering the Mediterranean Sea and is associated with a reduced incidence of obesity and type-2 diabetes mellitus (T2DM). The objective of this study was to examine differences in the composition of the oral microbiome in older adult...
Preprint
The diversity of microbial insertion sequences, crucial mobile genetic elements in generating diversity in microbial genomes, needs to be better represented in current microbial databases. Identification of these sequences in microbiome communities presents some significant problems that have led to their underrepresentation. Here, we present a sof...
Preprint
Full-text available
Palatine tonsils are secondary lymphoid organs representing the first line of immunological defense against inhaled or ingested pathogens. Here, we present a comprehensive census of cell types forming the human tonsil by applying single-cell transcriptome, epigenome, proteome and adaptive immune repertoire sequencing as well as spatial transcriptom...
Preprint
Full-text available
Recent progress in Single-Cell Genomics has produced different library protocols and techniques for profiling of one or more data modalities in individual cells. Machine learning methods have separately addressed specific integration challenges (libraries, samples, paired-unpaired data modalities). We formulate a unifying data-driven methodology...
Article
Full-text available
Understanding the regulation of normal and malignant human hematopoiesis requires a comprehensive cell atlas of the hematopoietic stem cell (HSC) regulatory microenvironment. Here, we develop a tailored bioinformatic pipeline to integrate public and proprietary single-cell RNA sequencing (scRNA-seq) datasets. As a result, we robustly identify for the...
Preprint
Full-text available
Early hematopoiesis is a continuous process in which hematopoietic stem and progenitor cells (HSPCs) gradually differentiate toward specific lineages. Aging and myeloid malignant transformation are characterized by changes in the composition and regulation of HSPCs. In this study, we used single cell RNA sequencing (scRNAseq) to characterize an enr...
Article
Full-text available
Multiple Sclerosis (MS), the leading cause of non-traumatic neurological disability in young adults, is a chronic inflammatory and neurodegenerative disease of the central nervous system (CNS). Due to the poor accessibility to the target organ, CNS-confined processes underpinning the later progressive form of MS remain elusive thereby limiting trea...
Preprint
Aims: In this work we investigated the embryonic origin of coronary arterio-ventricular connections, known as coronary artery fistulas (CAF), a congenital heart disease associated with postnatal and adult changes in systemic hemodynamics that may cause cardiac ischemia. Methods and results: We have used different animal models (mouse and avian embryos...
Article
Full-text available
Purpose: Objective markers of usual diet are of interest as alternative or validating tools in nutritional epidemiology research. The main purpose of the work was to assess whether saliva protein composition can reflect dietary habits in older adults, and how type 2 diabetes impacted on the saliva-diet correlates. Methods: 214 participants were selected...
Preprint
Full-text available
Background: Early detection through screening programs has proven to be the most effective strategy to reduce the incidence and mortality of colorectal cancer. The most widely implemented non-invasive screening test is the fecal immunochemical test, which presents an inadequate sensitivity for the detection of precancerous advanced adenomas. This fa...
Article
Background: The putative involvement of chromatin states in multiple sclerosis (MS) is thus far unclear. Here we determined the association of chromatin-accessibility with concurrent genetic, epigenetic and transcriptional events. Material & methods: We generated paired assay for transposase-accessible chromatin sequencing and RNA-seq profiles from...
Preprint
Full-text available
Understanding the regulation of normal and malignant human hematopoiesis requires a comprehensive cell atlas of the hematopoietic stem cell (HSC) regulatory microenvironment. Here, we develop a tailored bioinformatic pipeline to integrate public and proprietary single-cell RNA sequencing (scRNA-seq) datasets. As a result, we robustly identify for the...
Preprint
Full-text available
Background: Profiling of mRNA expression is an important method to identify biomarkers but is complicated by limited correlations between mRNA expression and protein abundance. We hypothesized that these correlations could be improved by mathematical models based on measuring splice variants and time delay in protein translation. Methods: We characteriz...
Preprint
Full-text available
Fanconi anemia (FA) is a monogenic inherited disease associated with mutations in genes that encode for proteins participating in the FA/BRCA DNA repair pathway. Mutations in FA genes result in chromosomal instability and cell death, leading to cancer risks and progressive cell mortality, most notably in hematopoietic stem and progenitor...
Preprint
Objectives: Production of anti-citrullinated peptide/protein antibodies (ACPA) is characteristic of rheumatoid arthritis (RA) and may inform about biological pathways involved in disease development in specific subgroups. Since multiple loci in genome-wide association screens have been implicated in RA risk, we investigated the association between...
Article
Full-text available
GM-CSF produced by autoreactive CD4-positive T helper cells is involved in the pathogenesis of autoimmune diseases, such as multiple sclerosis. However, the molecular regulators that establish and maintain the features of GM-CSF-positive CD4 T cells are unknown. In order to identify these regulators, we isolated human GM-CSF-producing CD4 T cells f...
Preprint
Full-text available
Background: Multiple Sclerosis (MS), the leading cause of non-traumatic neurological disability in young adults, is a chronic inflammatory and neurodegenerative disease of the central nervous system (CNS). Due to the poor accessibility to the target organ, CNS-confined processes underpinning the later progressive form of MS remain elusive thereby li...
Article
Full-text available
During the last decade, extensive efforts have been made to comprehend cardiac cell genetic and functional diversity. Such knowledge allows for the definition of the cardiac cellular interactome as a reasonable strategy to increase our understanding of the normal and pathologic heart. Previous experimental approaches including cell lineage tracing,...
Article
Full-text available
Background: While programmed cell death receptor 1 (PD-1) blockade treatment has revolutionized treatment of patients with melanoma, clinical outcomes are highly variable, and only a fraction of patients show durable responses. Therefore, there is a clear need for predictive biomarkers to select patients who will benefit from the treatment. Method...
Preprint
Full-text available
The role of gut microbiota in humans is of great interest, and metagenomics provided the possibilities for extensively analysing bacterial diversity in health and disease. Here we explored the human gut microbiome samples across 19 countries, performing compositional, functional and integrative analysis. To complement these data and analyse the sta...
Article
Full-text available
Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decisi...
Article
Phenotype-specific omic expression patterns in people with frailty could provide invaluable insight into the underlying multi-systemic pathological processes and targets for intervention. Classical approaches to frailty have not considered the potential for different frailty phenotypes. We characterized associations between frailty (with/without di...
Article
Phenotype-specific omic expression patterns in people with frailty could provide invaluable insight into the underlying multi-systemic pathological processes and targets for intervention. Classical approaches to frailty have not considered the potential for different frailty phenotypes. We characterized associations between frailty (with/without di...
Preprint
Full-text available
Background: Single-cell multi-omics technologies allow the profiling of different data modalities from the same cell. However, while isolated modalities only capture one view of the total information of a biological cell, an integrative analysis capturing the different modalities is challenging. In response, bioinformatics and machine learning metho...
Preprint
Full-text available
Recent progress in single-cell genomics has generated multiple tools for cell clustering, annotation, and trajectory inference; yet, inferring their associated regulatory mechanisms is unresolved. Here we present scMomentum, a model-based data-driven formulation to predict gene regulatory networks and energy landscapes from single-cell transcriptom...
Preprint
Full-text available
Technologies for profiling samples using different omics platforms have been at the forefront since the human genome project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch with an arbitrary decisi...
Article
Full-text available
Background: Gene-set analysis tools, which make use of curated sets of molecules grouped based on their shared functions, aim to identify which gene-sets are over-represented in the set of features that have been associated with a given trait of interest. Such tools are frequently used in gene-centric approaches derived from RNA-sequencing or micr...
Article
Full-text available
Background: Cardiac fibroblasts (CFs) have a central role in the ventricular remodeling process associated with different types of fibrosis. Recent studies have shown that fibroblasts do not respond homogeneously to heart injury. Because of the limited set of bona fide fibroblast markers, a proper characterization of fibroblast population heteroge...
Article
Background: There is growing evidence that the Mediterranean (Medi) diet may lower the risk of type 2 diabetes mellitus (T2DM). Whether this association is due to the Medi diet by itself or is mediated by a diet-associated lower rate of overweight is uncertain. Our aim was to disentangle these relationships among UK adults. Methods: Based on 21...
Preprint
Full-text available
Until recently, the contribution of bacteriophages to the composition and function of the human microbiome has been largely overlooked. Recent developments in discovering novel bacteriophages from human metagenomes have been mostly focused on the gut. Here we profile and compare the phageome of 633 human oral sites and 221 paired gut phageomes acqu...
Article
Full-text available
Multi-omic studies combine measurements at different molecular levels to build comprehensive models of cellular systems. The success of a multi-omic data analysis strategy depends largely on the adoption of adequate experimental designs, and on the quality of the measurements provided by the different omic platforms. However, the field lacks a comp...
Article
Patients with congenital adrenal hyperplasia (CAH) are at risk of long-term cognitive and metabolic sequelae with some of the effects being attributed to the chronic glucocorticoid treatment that they receive. Our pilot study investigates genome-wide DNA methylation in patients with CAH to determine whether there is preliminary evidence for epigeno...
Article
Background & aims: Acute-on-chronic liver failure (ACLF) is a newly described syndrome, which develops in patients with acute decompensation of cirrhosis, and is characterized by intense systemic inflammation, multiple organ failures and high short-term mortality. The profile of circulating lipid mediators, which are endogenous signaling molecules...
Article
Full-text available
Peripheral arterial disease (PAD) is associated with a high risk of cardiovascular events and death and is postulated to be a critical socioeconomic cost in the future. Extracellular vesicles (EVs) have emerged as potential candidates for new biomarker discovery related to their protein and nucleic acid cargo. In search of new prognostic and therap...
Article
Full-text available
The global threat of antimicrobial resistance has driven the use of high-throughput sequencing techniques to monitor the profile of resistance genes, known as the resistome, in microbial populations. The human oral cavity contains a poorly explored reservoir of these genes. Here we analyse and compare the resistome profiles of 788 oral cavities wor...
Article
Full-text available
Dysregulation of the kynurenine pathway has been regarded as a mechanism of tumor immune escape by the enzymatic activity of indoleamine 2,3-dioxygenase and kynurenine production. However, the immune-modulatory properties of other kynurenine metabolites such as kynurenic acid, 3-hydroxykynurenine, and anthranilic acid are poorly understood. In thi...
Poster
Full-text available
Background/Purpose: Big data are defined as data sets that are too large or complex for traditional data-processing application software to adequately deal with. Artificial Intelligence (AI) includes various statistical techniques which can deal with big data. The current use of these concepts in publications related to rheumatic and musculoskeletal...
Article
Full-text available
Rapid advances in single-cell assays have outpaced methods for analysis of those data types. Different single-cell assays show extensive variation in sensitivity and signal to noise levels. In particular, scATAC-seq generates extremely sparse and noisy datasets. Existing methods developed to analyze this data require cells amenable to pseudo-time a...
Article
Full-text available
Multi-omics approaches use a diversity of high-throughput technologies to profile the different molecular layers of living cells. Ideally, the integration of this information should result in comprehensive systems models of cellular physiology and regulation. However, most multi-omics projects still include a limited number of molecular assays and...
Preprint
Full-text available
Background: Patients with congenital adrenal hyperplasia (CAH) are at risk of long-term cognitive and metabolic sequelae. This study investigates genome-wide DNA methylation in patients with CAH to determine whether there is evidence for epigenomic reprogramming as well as any relationship to patient outcome. Methods: We analysed CD4+ T cell DNA fr...
Conference Paper
Full-text available
Background/Purpose: Big data are defined as data sets that are too large or complex for traditional data-processing application software to adequately deal with. Artificial Intelligence (AI) includes various statistical techniques which can deal with big data. The current use of these concepts in publications related to rheumatic and musculoskeleta...
Article
We analyzed genetic data of 47,429 multiple sclerosis (MS) and 68,374 control subjects and established a reference map of the genetic architecture of MS that includes 200 autosomal susceptibility variants outside the major histocompatibility complex (MHC), one chromosome X variant, and 32 variants within the extended MHC. We used an ensemble of met...
Article
Full-text available
Background: A poor fat-soluble micronutrient (FMN) and a high oxidative stress status are associated with frailty. Our aim was to determine the cross-sectional association of FMNs and oxidative stress biomarkers [protein carbonyls (PrCarb) and 3-nitrotyrosine] with the frailty status in participants older than 65 years. Methods: Plasma levels of...
Article
Full-text available
Multiple Sclerosis (MS) is an autoimmune disease of the central nervous system with prominent neurodegenerative components. The triggering and progression of MS are associated with transcriptional and epigenetic alterations in several tissues, including peripheral blood. The combined influence of transcriptional and epigenetic changes associated wit...