Marcel J T Reinders

Delft University of Technology | TU · Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS)

PhD

About

636
Publications
73,111
Reads
14,160
Citations
Introduction
Computer Scientist specialized in pattern recognition with applications in bioinformatics, computer vision and information retrieval
Additional affiliations
August 2011 - present
Netherlands Bioinformatics Centre
Position
  • Scientific director
January 2005 - December 2012
Technische Universiteit Delft
Education
September 1990 - December 1995
Delft University of Technology
Field of study
  • Electrical Engineering
September 1984 - September 1990
Delft University of Technology
Field of study
  • Applied Physics


Publications (636)
Article
Full-text available
Population-scale expression profiling studies can provide valuable insights into biological and disease-underlying mechanisms. The availability of phenotypic traits is essential for studying clinical effects. Therefore, missing, incomplete, or inaccurate phenotypic information can make analyses challenging and prevent RNA-seq or other omics data to...
Preprint
Single-cell genomics is now producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissue in health as well as disease. Such large-scale atlases not only increase the scale and generalizability of analyses but also enable combining the knowledge generated by individual studies. Specific...
Preprint
With age, somatic mutations accumulated in human brain cells can lead to various neurological disorders and brain tumors. Since the incidence rate of Alzheimer’s disease (AD) increases exponentially with age, investigating the association between AD and the accumulation of somatic mutations can help understand the etiology of AD. Here we built a som...
Preprint
There is an exponential increase in the number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets. Concurrently, scRNA-seq datasets become increasingly sparse as more zero counts are measured for many genes. We discuss that with increasing sparsity the binarized representation of gene expression becomes as informative as count-bas...
Article
The integration of metabolomics data with sequencing data is a key step towards improving the diagnostic process for finding the disease-causing genetic variant(s) in patients suspected of having an inborn error of metabolism (IEM). The measured metabolite levels could provide additional phenotypical evidence to elucidate the degree of pathogenicit...
Preprint
Motivation: Single-cell technologies allow deep characterization of different molecular aspects of cells. Integrating these modalities provides a comprehensive view of cellular identity. Current integration methods rely on overlapping features or cells to link datasets measuring different modalities, limiting their application to experiments where d...
Article
Full-text available
The IMGT database profiles the TR germline alleles for all four TR loci (TRA, TRB, TRG and TRD); however, it does not include information regarding population specificity and allelic frequencies of these germline alleles. The specificity of allelic variants to different human populations can, however, be a rich source of information when st...
Article
Genome-wide association studies (GWAS) have been highly informative in discovering disease-associated loci but are not designed to capture all structural variations in the human genome. Using long-read sequencing data, we discovered widespread structural variation within SINE-VNTR-Alu (SVA) elements, a class of great ape-specific transposable eleme...
Preprint
Preclinical models are essential to cancer research; however, key biological differences with patient tumors result in reduced translatability to the clinic and high attrition rates in drug development. Variability among and between patients, preclinical models, and individual cells obscures commonalities which could otherwise be exploited therapeu...
Article
Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-...
Article
Full-text available
Objective: To facilitate patient disease subset and risk factor identification by constructing a pipeline which is generalizable, provides easily interpretable results, and allows replication by overcoming electronic health records (EHRs) batch effects. Material and Methods: We used 1872 billing codes in EHRs of 102 880 patients from 12 healthcare s...
Preprint
Transcriptome-wide association studies (TWAS) can provide valuable insights into biological and disease-underlying mechanisms. For studying clinical effects, availability of (confounding) phenotypic traits is essential. The (re)use of RNA-seq or other omics data can be limited by missing, incomplete, or inaccurate phenotypic information. A possible...
Article
Full-text available
Several studies have analyzed gene expression profiles in the substantia nigra to better understand the pathological mechanisms causing Parkinson’s disease (PD). However, the concordance between the identified gene signatures in these individual studies was generally low. This might have been caused by a change in cell type composition as loss of d...
Article
Full-text available
In prenatal diagnostics, NIPT screening utilizing read coverage-based profiles obtained from shallow WGS data is routinely used to detect fetal CNVs. From this same data, fragment size distributions of fetal and maternal DNA fragments can be derived, which are known to be different, and often used to infer fetal fractions. We argue that the fragmen...
Article
Full-text available
Human longevity is influenced by the genetic risk of age-related diseases. As Alzheimer’s disease (AD) represents a common condition at old age, an interplay between genetic factors affecting AD and longevity is expected. We explored this interplay by studying the prevalence of AD-associated single-nucleotide-polymorphisms (SNPs) in cognitively hea...
Article
Full-text available
The maintenance of pancreatic islet architecture is crucial for proper β-cell function. We previously reported that disruption of human islet integrity could result in altered β-cell identity. Here we combine β-cell lineage tracing and single-cell transcriptomics to investigate the mechanisms underlying this process in primary human islet cells. Us...
Article
Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus spac...
Article
Full-text available
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporat...
Article
Full-text available
To mount an adequate immune response against pathogens, stepwise mutation and selection processes are crucial functions of the adaptive immune system. To better characterize a successful vaccination response, we performed longitudinal (days 0, 5, 7, 10, and 14 after Boostrix vaccination) analysis of the single-cell transcriptome as well as the B-ce...
Article
Full-text available
A major challenge for treating pancreatic ductal adenocarcinoma (PDAC) patients is the unpredictability of their prognoses due to high heterogeneity. We present Multi-Omics DEep Learning for Prognosis-correlated subtyping (MODEL-P) to identify PDAC subtypes and to predict prognoses of new patients. MODEL-P was trained on autoencoder integrated mult...
Preprint
Full-text available
Adaptation of the immune system to mount an adequate immune response against pathogens is a crucial function of the adaptive immune system. To better characterize a successful vaccination response, we performed longitudinal (days 0, 5, 7, 10, and 14 after Boostrix vaccination) analysis of the single cell transcriptome as well as the B-cell receptor...
Article
Full-text available
Single-cell RNA sequencing data is characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological variation rather than technical artifacts. We propose to use binarized expression profiles to identify the effects of biological variation in single-cell RNA sequencing data. Using 16 publicly availabl...
Article
Full-text available
Cortical atrophy is a common manifestation in Parkinson’s disease (PD), particularly in advanced stages of the disease. To elucidate the molecular underpinnings of cortical thickness changes in PD, we performed an integrated analysis of brain-wide healthy transcriptomic data from the Allen Human Brain Atlas and patterns of cortical thickness based...
Article
Full-text available
Background and Aims: Protein profiling in patients with inflammatory bowel diseases (IBD) for diagnostic and therapeutic purposes is underexplored. This study analysed the association between phenotype, genotype and the plasma proteome in IBD. Methods: Ninety-two (92) inflammation-related proteins were quantified in plasma of 1,028 patients w...
Article
Full-text available
Genetic factors play a major role in frontotemporal dementia (FTD). The majority of FTD cannot be genetically explained yet and it is likely that there are still FTD risk loci to be discovered. Common variants have been identified with genome-wide association studies (GWAS), but these studies have not systematically searched for rare varia...
Preprint
Full-text available
Studying cellular differentiation using single-cell RNA sequencing (scRNA-seq) rapidly expands our understanding of cellular development processes. Recently, RNA velocity has created new possibilities in studying these cellular differentiation processes, as differentiation dynamics can be obtained from measured spliced and unspliced mRNA expression...
Article
Full-text available
Background: Dementia with Lewy bodies (DLB) is a complex, progressive neurodegenerative disease with considerable phenotypic, pathological, and genetic heterogeneity. Objective: We tested if genetic variants in part explain the heterogeneity in DLB. Methods: We tested the effects of variants previously associated with DLB (near APOE, GBA, and...
Preprint
Full-text available
Missing or incomplete phenotypic information can severely deteriorate the statistical power in epidemiological studies. High-throughput quantification of small-molecules in bio-samples, i.e. metabolomics, is steadily gaining popularity, as it is highly informative for various phenotypical characteristics. Here we aim to leverage metabolomics to imp...
Article
Full-text available
Immunoglobulin (IG) loci harbor inter-individual allelic variants in many different germline IG variable, diversity and joining genes of the IG heavy (IGH), kappa (IGK) and lambda (IGL) loci, which together form the genetic basis of the highly diverse antigen-specific B-cell receptors. These allelic variants can be shared between or be specific to...
Article
Full-text available
Genetic association studies are frequently used to study the genetic basis of numerous human phenotypes. However, the rapid interrogation of how well a certain genomic region associates across traits as well as the interpretation of genetic associations is often complex and requires the integration of multiple sources of annotation, which involves...
Preprint
Full-text available
Several studies have analyzed gene expression profiles in the substantia nigra to better understand the pathological mechanisms causing Parkinson’s disease (PD). However, the concordance between the identified gene signatures in these individual studies was generally low. This might be caused by a change in cell type composition as loss of dopamine...
Preprint
Full-text available
The integration of metabolomics data with sequencing data is a key step towards improving the diagnostic process for finding the disease-causing gene(s) in patients suspected of having an inborn error of metabolism (IEM). The measured metabolite levels could provide additional phenotypical evidence to elucidate the degree of pathogenicity for varia...
Preprint
Full-text available
The global population is growing older. As age is a primary risk factor of (multi)morbidity, there is a need for novel indicators to predict, track, treat and prevent the development of disease. Lifestyle interventions have shown promising results in improving the health of participants and reducing the risk for disease, but in the elderly populati...
Article
Full-text available
Supervised methods are increasingly used to identify cell populations in single-cell data. Yet, current methods are limited in their ability to learn from multiple datasets simultaneously, are hampered by the annotation of datasets at different resolutions, and do not preserve annotations when retrained on new datasets. The latter point is especial...
Article
Full-text available
Controlled human infections provide opportunities to study the interaction between the immune system and malaria parasites, which is essential for vaccine development. Here, we compared immune signatures of malaria-naive Europeans and of Africans with lifelong malaria exposure using mass cytometry, RNA sequencing and data integration, before and 5...
Article
Full-text available
The genotype-phenotype link is a major research topic in the life sciences but remains highly complex to disentangle. Part of the complexity arises from the number of genes contributing to the observed phenotype. Despite the vast increase of molecular data, pinpointing the causal variant underlying a phenotype of interest is still challenging. In t...
Article
Full-text available
A Correction to this paper has been published: https://doi.org/10.1038/s41467-021-22613-2
Preprint
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labelled protein training data. A recently published supervised molecular function predicting model partly circumvents this limitation by making its predictions based on the universal (i.e. task-agnost...
Article
Full-text available
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these...
Article
Full-text available
Structural covariance networks are able to identify functionally organized brain regions by gray matter volume covariance across a population. We examined the transcriptomic signature of such anatomical networks in the healthy brain using post‐mortem microarray data from the Allen Human Brain Atlas. A previous study revealed that a posterior cingul...
Preprint
Full-text available
Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data w...
Preprint
Full-text available
T-cell receptor (TR) germline allele sequences are arranged, organized and made available to the research community by the IMGT database. This state-of-the-art database, however, does not provide information regarding population specificity and allelic frequencies of the four human TR loci (TRA, TRB, TRG and TRD). The specificity of allelic v...
Preprint
Full-text available
The genetics underlying human longevity is influenced by the genetic risk to develop (or escape) age-related diseases. As Alzheimer's disease (AD) represents one of the most common conditions at old age, an interplay between genetic factors for AD and longevity is expected. We explored this interplay by studying the prevalence of 38 AD-associated s...
Preprint
Full-text available
Single-cell RNA sequencing data is characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological variation rather than technical artifacts. We propose differential dropout analysis (DDA), as an alternative to differential expression analysis (DEA), to identify the effects of biological variation in single-c...
Article
Motivation: Single-cell data measures multiple cellular markers at the single-cell level for thousands to millions of cells. Identification of distinct cell populations is a key step for further biological understanding, usually performed by clustering this data. Dimensionality reduction based clustering tools are either not scalable to large datase...
Article
Full-text available
Untargeted metabolomics is an emerging technology in the laboratory diagnosis of inborn errors of metabolism (IEM). Analysis of a large number of reference samples is crucial for correcting variations in metabolite concentrations that result from factors, such as diet, age, and gender in order to judge whether metabolite levels are abnormal. Howeve...
Preprint
Full-text available
T-cell receptor (TR) germline alleles are arranged, organized and made available to the research community by the IMGT database. This state-of-the-art database, however, does not provide information regarding population specificity and allelic frequencies of the genes of all four human TR loci (TRA, TRB, TRG and TRD). The specificity of allelic va...
Preprint
Full-text available
The ever-increasing number of analyzed cells in single-cell RNA sequencing (scRNA-seq) experiments imposes several challenges on the data analysis. Current analysis methods lack scalability to large datasets, hampering interactive visual exploration of the data. We present Cytosplore-Transcriptomics, a framework to analyze scRNA-seq data, including...
Article
Full-text available
Background: In animal breeding, identification of causative genetic variants is of major importance and high economical value. Usually, the number of candidate variants exceeds the number of variants that can be validated. One way of prioritizing probable candidates is by evaluating their potential to have a deleterious effect, e.g. by predicting...
Article
Developing Alzheimer’s disease (AD) is influenced by multiple genetic variants that are involved in five major AD‐pathways: immune response, β‐amyloid metabolism, endocytosis, cholesterol/lipid metabolism and angiogenesis. The extent to which these pathways are involved in the resilience against AD has thus far been poorly addressed. Here, we inve...
Article
The majority of individuals with subjective cognitive decline (SCD) is worried well, but in some the subjective experience of cognitive decline heralds incipient neurodegenerative disease. Understanding the determinants of disease in SCD is important to separate wheat from chaff. Here we studied APOE and a polygenic risk score for Alzheimer’s dise...
Article
Full-text available
Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent t...
Article
Full-text available
Studying the genome of centenarians may give insights into the molecular mechanisms underlying extreme human longevity and the escape of age-related diseases. Here, we set out to construct polygenic-risk-scores (PRS) for longevity and to investigate the functions of longevity-associated variants. Using a cohort of centenarians with maintained cogni...
Preprint
Full-text available
Genetic association studies are largely used to study the genetic basis of numerous traits. However, the interpretation of genetic associations is often complex and requires the integration of multiple sources of annotation. We developed snpXplorer, a web-server application for exploring SNP-association statistics across human traits and functiona...
Article
Full-text available
The human genome harbors numerous structural variants (SVs) which, due to their repetitive nature, are currently underexplored in short-read whole-genome sequencing approaches. Using single-molecule, real-time (SMRT) long-read sequencing technology in combination with FALCON-Unzip, we generated a de novo assembly of the diploid genome of a 115-year...
Article
Full-text available
The current rate at which new DNA and protein sequences are being generated is too fast to experimentally discover the functions of those sequences, emphasizing the need for accurate Automatic Function Prediction (AFP) methods. AFP has been an active and growing research field for decades and has made considerable progress in that time. However, it...