About
117
Publications
21,014
Reads
1,609
Citations
Introduction
Additional affiliations
March 2018 - August 2018
October 2015 - February 2018
Publications (117)
Funding Acknowledgements
Type of funding sources: Foundation. Main funding source(s): German Center for Cardiovascular Research (DZHK)
Background/Introduction
RPs are young, hyper-reactive and RNA-rich platelets. They have a pro-thrombotic potential, are predictors of an insufficient response to antiplatelet therapy after myocardial infarction and...
Funding Acknowledgements
Type of funding sources: Public grant(s) – National budget only. Main funding source(s): This work was supported by the German Center for Cardiovascular Research (DZHK grant number Deutsches Zentrum für Herz-Kreislaufforschung 81 × 3600606 to D.B.).
Abstract:
Background/Introduction
Reticulated platelets (RPs) are prothro...
Cancer is a heterogeneous disease characterized by unregulated cell growth and promoted by mutations in cancer driver genes some of which encode suitable drug targets. Since the distinct set of cancer driver genes can vary between and within cancer types, evidence-based selection of drugs is crucial for targeted therapy following the precision medi...
Molecular signatures have been suggested as biomarkers to classify pancreatic ductal adenocarcinoma (PDAC) into two, three or four subtypes. Since the robustness of existing signatures is controversial, we performed a systematic evaluation of three established signatures for PDAC stratification across eight publicly available datasets. Clustering r...
Motivation
As complex tissues are typically composed of various cell types, deconvolution tools have been developed to computationally infer their cellular composition from bulk RNA sequencing (RNA-seq) data. To comprehensively assess deconvolution performance, gold-standard datasets are indispensable. Gold-standard, experimental techniques like fl...
During disease progression or organism development, alternative splicing (AS) may lead to isoform switches (IS) that demonstrate similar temporal patterns and reflect the AS co-regulation of such genes. Tools for dynamic process analysis usually neglect AS. Here we propose Spycone ( https://github.com/yollct/spycone ), a splicing-aware framework fo...
Motivation
A key problem in systems biology is the discovery of regulatory mechanisms that drive phenotypic behavior of complex biological systems in the form of multi-level networks. Modern multiomics profiling techniques probe these fundamental regulatory networks but are often hampered by experimental restrictions leading to missing data or part...
Motivation
Cancer is one of the leading causes of death worldwide. Despite significant improvements in prevention and treatment, mortality remains high for many cancer types. Hence, innovative methods that use molecular data to stratify patients and identify biomarkers are needed. Promising biomarkers can also be inferred from competing endogenous...
De novo pathway enrichment is a systems biology approach in which OMICS data are projected onto a molecular interaction network to identify subnetworks representing condition-specific functional modules and molecular pathways. Compared to classical pathway enrichment analysis methods, de novo pathway enrichment is not limited to predefined lists of...
Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distr...
Background
Exclusive enteral nutrition (EEN) is a first-line induction therapy for paediatric Crohn’s disease (CD). Although the protective mechanisms remain unclear, previous studies showed substantial changes in microbiome composition in response to EEN. We aim to assess the protective function of EEN and its impact on gut microbiome signatures i...
Background
Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems.
Objectives
However, training an AI model on sensitive data raises conc...
Alternative splicing is a major contributor to transcriptome and proteome diversity in health and disease. A plethora of tools have been developed for studying alternative splicing in RNA-seq data. Previous benchmarks focused on isoform quantification and mapping. They neglected event detection tools, which arguably provide the most detailed insigh...
Motivation
Disease module mining methods (DMMMs) extract subgraphs that constitute candidate disease mechanisms from molecular interaction networks such as protein-protein interaction (PPI) networks. Irrespective of the employed models, DMMMs typically include non-robust steps in their workflows, i.e., the computed subnetworks vary when running th...
Mass cytometry (CyTOF) is a new technology that allows the investigation of protein expression at single cell level with high resolution. While several protocols are available to investigate leukocyte expression, platelet staining and analysis with CyTOF have been described only from whole blood. Moreover, available protocols do not allow sample st...
Background
16S rRNA gene profiling is currently the most widely used technique in microbiome research and allows for studying microbial diversity, taxonomic profiling, phylogenetics, functional and network analysis. While a plethora of tools have been developed for the analysis of 16S rRNA gene data, only few platforms offer a user-friendly interfa...
Aggregating transcriptomics data across hospitals can increase sensitivity and robustness of differential expression analyses, yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, the accuracy might drop if class labels are inhomogeneous...
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with structural annotations of protein-prote...
Despite impressive efforts invested in epigenetic research in the last 50 years, clinical applications are still lacking. Only a few university hospital centers currently use epigenetic biomarkers at the bedside. Moreover, the overall concept of precision medicine is not widely recognized in routine medical practice and the reductionist approach re...
For complex diseases, most drugs are highly ineffective, and the success rate of drug discovery is in constant decline. While low quality, reproducibility issues, and translational irrelevance of most basic and preclinical research have contributed to this, the current organ-centricity of medicine and the ‘one disease–one target–one drug’ dogma obs...
Cytometry techniques are widely used to discover cellular characteristics at single-cell resolution. Many data analysis methods for cytometry data focus solely on identifying subpopulations via clustering and testing for differential cell abundance. For differential expression analysis of markers between conditions, only few tools exist. These tool...
Background
Reticulated platelets (RPs) are young, hyper-reactive thrombocytes that contain more RNA than mature platelets (MPs). Measuring RP levels in peripheral blood with point-of-care systems is fast, reproducible, and inexpensive. Elevated RPs in peripheral blood predict adverse events in patients with acute and chronic coron...
We present the AIMe registry, a community-driven reporting platform for AI in biomedicine. It aims to enhance the accessibility, reproducibility and usability of biomedical AI models, and allows future revisions by the community.
View-only version: https://rdcu.be/cv5H7
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection induces a coagulopathy characterized by platelet activation and a hypercoagulable state with an increased incidence of cardiovascular events. The viral spike protein S has been reported to enhance thrombosis formation, stimulate platelets to release procoagulant factors, and pro...
Cytometry techniques are widely used to discover cellular characteristics at single-cell resolution. Many data analysis methods for cytometry data focus solely on identifying subpopulations via clustering and testing for differential cell abundance. For differential expression analysis of markers between conditions, only few tools exist. These tool...
Alternative splicing (AS) is an important aspect of gene regulation. Nevertheless, its role in molecular processes and pathobiology is far from understood. A roadblock is that tools for the functional analysis of AS-set events are lacking. To mitigate this, we developed NEASE, a tool integrating pathways with protein-protein and domain-domain inter...
Lack of reproducibility in gene expression studies is a serious issue being actively addressed by the biomedical research community. Besides established factors such as batch effects and incorrect sample annotations, we recently reported tissue heterogeneity, a consequence of unintended profiling of cells of other origins than the tissue of interes...
SARS-CoV-2 infection induces a coagulopathy characterized by platelet activation and a hypercoagulable state with an increased incidence of cardiovascular events. The viral spike protein S has been reported to enhance thrombosis formation, stimulate platelets to release pro-coagulant factors and promote the formation of platelet-leukocyte aggregate...
Machine Learning (ML) and Artificial Intelligence (AI) have shown promising results in many areas and are driven by the increasing amount of available data. However, this data is often distributed across different institutions and cannot be shared due to privacy concerns. Privacy-preserving methods, such as Federated Learning (FL), allow for traini...
Microorganisms including bacteria, fungi, viruses, protists and archaea live as communities in complex and contiguous environments. They engage in numerous inter- and intra- kingdom interactions which can be inferred from microbiome profiling data. In particular, network-based approaches have proven helpful in deciphering complex microbial interact...
In network and systems medicine, active module identification methods (AMIMs) are widely used for discovering candidate molecular disease mechanisms. To this end, AMIMs combine network analysis algorithms with molecular profiling data, most commonly, by projecting gene expression data onto generic protein–protein interaction (PPI) networks. Althoug...
Epigenetics studies inheritable and reversible modifications of DNA that allow cells to control gene expression throughout their development and in response to environmental conditions. In computational epigenomics, machine learning is applied to study various epigenetic mechanisms genome wide. Its aim is to expand our understanding of cell differe...
Background: Transcriptional regulation of gene expression is crucial for the adaptation and survival of bacteria. Regulatory interactions are commonly modeled as Gene Regulatory Networks (GRNs) derived from experiments such as RNA-seq, microarray and ChIP-seq. While the reconstruction of GRNs is fundamental to decipher cellular function, even GRNs...
A plethora of tools exist for RNA-Seq data analysis with a focus on alternative splicing (AS). However, appropriate data for their comparative evaluation is missing. The R package ASimulatoR simulates gold standard RNA-Seq datasets with fine-grained control over the distribution of AS events, which allow for evaluating alternative splicing tools, e...
Short-amplicon 16S rRNA gene sequencing is currently the method of choice for studies investigating microbiomes. However, comparative studies on differences in procedures are scarce. We sequenced human stool samples and mock communities with increasing complexity using a variety of commonly used protocols. Short amplicons targeting different variab...
microRNAs (miRNAs) are post-transcriptional regulators involved in many biological processes and human diseases, including cancer. The majority of transcripts compete over a limited pool of miRNAs, giving rise to a complex network of competing endogenous RNA (ceRNA) interactions. Currently, gene-regulatory networks focus mostly on transcription fa...
microRNAs (miRNAs) are post-transcriptional regulators involved in many biological processes and human diseases, including cancer. The majority of transcripts compete over a limited pool of miRNAs, giving rise to a complex network of competing endogenous RNA (ceRNA) interactions. Currently, gene-regulatory networks focus mostly on transcription fac...
Novel coronavirus disease 2019 (COVID-19) is associated with a hypercoagulable state, characterized by abnormal coagulation parameters and by increased incidence of cardiovascular complications. With this study, we aimed to investigate the activation state and the expression of transmembrane proteins in platelets of hospitalized COVID-19 patients....
Motivation
Unsupervised learning approaches are frequently employed to stratify patients into clinically relevant subgroups and to identify biomarkers such as disease-associated genes. However, clustering and biclustering techniques are oblivious to the functional relationship of genes and are thus not ideally suited to pinpoint molecular mechanism...
Background:
Lack of reproducibility in gene expression studies has recently attracted much attention in and beyond the biomedical research community. Previous efforts have identified many underlying factors, such as batch effects and incorrect sample annotations. Recently, tissue heterogeneity, a consequence of unintended profiling of cells of othe...
Motivation
Recently, various tools for detecting single nucleotide polymorphisms (SNPs) involved in epistasis have been developed. However, no studies evaluate the employed statistical epistasis models such as the χ2-test or quadratic regression independently of the tools that use them. Such an independent evaluation is crucial for developing impro...
Federated learning is a well-established approach to privacy-preserving training of a joint model on heavily distributed data. Federated averaging (FedAvg) is a well-known communication-efficient algorithm for federated learning, which performs well if the data distribution across the clients is independently and identically distributed (IID). Howe...
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need...
Aggregating clinical transcriptomics data across hospitals can increase sensitivity and robustness of differential gene expression analyses yielding deeper clinical insights. As data exchange is often restricted by privacy legislation, meta-analyses are frequently employed to pool local results. However, if class labels or confounders are inhomogen...
The field of breath analysis lacks a fully automated analysis platform that enforces machine learning good practice and enables clinicians and clinical researchers to rapidly and reproducibly discover metabolite patterns in diseases. We present BALSAM, a comprehensive web platform to simplify and automate this process, offering features for preproce...
Alternative splicing plays a major role in regulating the functional repertoire of the proteome. However, isoform-specific effects on protein-protein interactions (PPIs) are usually overlooked, making it impossible to judge the functional role of individual exons on a systems biology level. We overcome this barrier by integrating protein-protein i...
Artificial intelligence (AI) has been successfully applied in numerous scientific domains including biomedicine and healthcare. Here, it has led to several breakthroughs ranging from clinical decision support systems and image analysis to whole genome sequencing. However, training an AI model on sensitive data also raises concerns about the privacy of...
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. Various studies exist about the molecular mechanisms of viral infection. However, such information is spread across many publications and it is very time-consuming to integrate, and exploit. We develop CoVex, an interactive online platform for SARS-CoV-2 ho...
Lifestyle, obesity, and the gut microbiome are important risk factors for metabolic disorders. We demonstrate in 1,976 subjects of a German population cohort (KORA) that specific microbiota members show 24-h oscillations in their relative abundance and identified 13 taxa with disrupted rhythmicity in type 2 diabetes (T2D). Cross-validated predictio...
Manipulating molecules that impact T cell receptor (TCR) or cytokine signaling, such as the protein tyrosine phosphatase non-receptor type 2 (PTPN2), has significant potential for advancing T cell-based immunotherapies. Nonetheless, it remains unclear how PTPN2 impacts the activation, survival, and memory formation of T cells. We find that PTPN2 de...
Genome-wide association studies (GWAS) have been widely used to unravel connections between genetic variants and diseases. Larger sample sizes in GWAS can lead to discovering more associations and more accurate genetic predictors. However, sharing and combining distributed genomic data to increase the sample size is often challenging or even imposs...
Objective
To facilitate shared decision-making for patients with knee osteoarthritis (OA), we aimed to build clinically applicable models to predict the individual change in pain intensity (VAS scale 0-100), knee-related quality of life (QoL) (KOOS QoL score 0-100) and walking speed (m/sec) immediately following two educational and 12 supervise...
A current challenge in genomics is to interpret non-coding regions and their role in transcriptional regulation of possibly distant target genes. Genome-wide association studies show that a large part of genomic variants are found in those non-coding regions, but their mechanisms of gene regulation are often unknown. An additional challenge is to r...
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need...
Drug research, therapy development, and other areas of pharmacology and medicine can benefit from simulations and optimization of mathematical models that contain a mathematical description of interactions between systems elements at the cellular, tissue, organ, body, and population level. This approach is the foundation of systems medicine and pre...
Coronavirus Disease-2019 (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus. It was first identified in Wuhan, China, and has since spread causing a global pandemic. Various studies have been performed to understand the molecular mechanisms of viral infection for predicting drug repurposing candidates. However, such information is s...
Simulated data is crucial for evaluating epistasis detection tools in genome-wide association studies. Existing simulators are limited, as they do not account for linkage disequilibrium (LD), support limited interaction models of single nucleotide polymorphisms (SNPs) and only dichotomous phenotypes, or depend on proprietary software. In contrast,...
Plants are essential for life and are extremely diverse organisms with unique molecular capabilities¹. Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where t...